Distance measures to compare real and ideal quantum processes 
With growing success in experimental implementations it is critical to identify a "gold standard"
for quantum information processing, a single measure of distance that can be used to compare and 
contrast different experiments. We enumerate a set of criteria such a distance measure must satisfy 
to be both experimentally and theoretically meaningful. We then assess a wide range of possible 
measures against these criteria, before making a recommendation as to the best measures to use in 
characterizing quantum information processing. 
I. INTRODUCTION

Many real-world imperfections arise when experimen- 
tally performing a quantum information processing task. 
These may arise either in the creation or measurement 
of a quantum state, or in the manipulation of the state 
via some quantum process. It is important to quantita- 
tively measure and characterize these imperfections in a 
way that is theoretically meaningful and experimentally 
practical. 

How can this be done? Quantum states can be 
completely determined using quantum state tomography
and compared using a variety of well-known
measures. Quantum processes can be measured using
an analogous procedure called quantum process tomography.
However, the problem of developing quantitative
measures to compare real and idealized quantum
processes has not been comprehensively addressed. 

Ideally there would be a single good measure, a "gold 
standard", enabling sensible comparison of different
experimental implementations of quantum information 
processing, and agreed upon by experimentalists and the- 
orists alike. We will refer to candidates for such a gold 
standard as "distance measures" for quantum processes, 
or as "error measures", when we want to stress the
comparison of real and idealized processes.

Such an error measure would be extremely useful both 
when comparing experiments with the theoretical ideal, 
and in comparing different experiments that attempt to 
perform the same task. Existing experiments in quan- 
tum information processing have typically been assessed 
on a rather ad hoc basis. For example, some implementa- 
tions of quantum logic gates have relied on demonstrating 
that those gates act in the correct way on computational 
basis states (i.e., verifying the truth table of the gate),
and a few superposition states. Such demonstrations 
are important, but it is clear that a figure of merit that 
is standardized, theoretically well motivated and experi- 
mentally practical would be a considerable step forward. 
Parenthetically, we note that such a measure would also 
be of great use in concretely connecting real experiments 
to results such as the fault-tolerance threshold for
quantum computation [8].

The purpose of this paper is to comprehensively ad- 
dress the problem of developing such error measures. 
There is a sizeable previous literature on this subject, but 
we believe that there has been a consistent gap between 
work motivated primarily by theoretical considerations, 
and work constrained by experimental realities. Our pa- 
per aims to address both theoretical and experimental 
desiderata. 

The key to our work is to introduce a list of six sim- 
ple, physically motivated criteria that should be satis- 
fied by any good measure of distance between quantum 
processes. These criteria enable us to eliminate many 
approaches to the definition of an error measure that a 
priori appear highly plausible. 

The criteria are as follows. Suppose $\Delta$ is a candidate
measure of the distance between two quantum processes.
Such processes are described by maps between input and
output quantum states, e.g., $\rho_{\rm out} = \mathcal{E}(\rho_{\rm in})$, where the
map $\mathcal{E}$ is known as a quantum operation. Physically,
$\Delta(\mathcal{E}, \mathcal{F})$ may be thought of in two ways: as a measure
of error in quantum information processing when
one wants to do the ideal process $\mathcal{F}$ but does $\mathcal{E}$ instead;
or of distinguishability between the two processes $\mathcal{E}$ and
$\mathcal{F}$. We believe that any such measure must satisfy the
following six properties, motivated by both physical and
mathematical concerns.

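(1) Metric: $\Delta$ should be a metric. This requires three
properties: (i) $\Delta(\mathcal{E}, \mathcal{F}) \geq 0$, with $\Delta(\mathcal{E}, \mathcal{F}) = 0$ if and only
if $\mathcal{E} = \mathcal{F}$; (ii) symmetry: $\Delta(\mathcal{E}, \mathcal{F}) = \Delta(\mathcal{F}, \mathcal{E})$; and (iii)
the triangle inequality $\Delta(\mathcal{E}, \mathcal{G}) \leq \Delta(\mathcal{E}, \mathcal{F}) + \Delta(\mathcal{F}, \mathcal{G})$.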

(2) Easy to calculate: it should be possible to evaluate
$\Delta$ in a direct manner.

(3) Easy to measure: there should be a clear and
achievable experimental procedure for determining the
value of $\Delta$.

(4) Physical interpretation: $\Delta$ should have a
well-motivated physical interpretation.

(5) Stability [13]: $\Delta(\mathcal{I} \otimes \mathcal{E}, \mathcal{I} \otimes \mathcal{F}) = \Delta(\mathcal{E}, \mathcal{F})$, where $\mathcal{I}$
represents the identity operation on an additional quantum
system. Physically, this means that unrelated ancillary
quantum systems do not affect the value of $\Delta$.

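(6) Chaining: $\Delta(\mathcal{E}_2 \circ \mathcal{E}_1, \mathcal{F}_2 \circ \mathcal{F}_1) \leq \Delta(\mathcal{E}_1, \mathcal{F}_1) + \Delta(\mathcal{E}_2, \mathcal{F}_2)$.
Thus, for a process composed of many smaller
steps, the total error will be no greater than the sum of the
errors in the individual steps.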

The chaining and stability criteria are key properties 
for estimating the error in a complex quantum informa- 
tion processing task. Because quantum information pro- 
cessing tasks are typically broken down into a sequence of 
simpler component operations, a conservative bound on 
the total error can be found by simply analyzing the indi- 
vidual components. This is critical for applications such 
as quantum computation, where full process tomography 
on an $n$-qubit computation requires exponentially many
measurements, and is thus infeasible. Chaining and stability
enable one to instead benchmark the constituent
processes involved in the computation, which can then 
be used to infer that the entire computation is robust. 

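Many other properties follow from these six criteria.
For example, from the metric and chaining criteria we see
that $\Delta(\mathcal{R} \circ \mathcal{E}, \mathcal{R} \circ \mathcal{F}) \leq \Delta(\mathcal{E}, \mathcal{F})$, where $\mathcal{R}$ is any quantum
operation. This corresponds to the requirement that
post-processing by $\mathcal{R}$ cannot increase the distinguishability
of two processes $\mathcal{E}$ and $\mathcal{F}$. Another elementary
consequence of the metric and chaining criteria is unitary
invariance, i.e., $\Delta(\mathcal{U} \circ \mathcal{E} \circ \mathcal{V}, \mathcal{U} \circ \mathcal{F} \circ \mathcal{V}) = \Delta(\mathcal{E}, \mathcal{F})$, where
$\mathcal{U}$ and $\mathcal{V}$ are unitary operations.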
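
The two consequences just stated can be spelled out in a few lines. The following is a sketch of one possible derivation, using only the metric and chaining criteria as listed above; the intermediate steps are our own filling-in and are not quoted from the original.

```latex
% Post-processing: apply chaining with E_1 = E, F_1 = F, E_2 = F_2 = R,
% then use the metric property \Delta(R, R) = 0.
\begin{align}
\Delta(\mathcal{R} \circ \mathcal{E}, \mathcal{R} \circ \mathcal{F})
  &\leq \Delta(\mathcal{E}, \mathcal{F}) + \Delta(\mathcal{R}, \mathcal{R})
   = \Delta(\mathcal{E}, \mathcal{F}). \\
% Pre-processing: apply chaining with E_1 = F_1 = V, E_2 = E, F_2 = F.
\Delta(\mathcal{E} \circ \mathcal{V}, \mathcal{F} \circ \mathcal{V})
  &\leq \Delta(\mathcal{V}, \mathcal{V}) + \Delta(\mathcal{E}, \mathcal{F})
   = \Delta(\mathcal{E}, \mathcal{F}). \\
% Combining both bounds for unitary U and V:
\Delta(\mathcal{U} \circ \mathcal{E} \circ \mathcal{V}, \mathcal{U} \circ \mathcal{F} \circ \mathcal{V})
  &\leq \Delta(\mathcal{E}, \mathcal{F}). \\
% The reverse bound follows by applying the same argument with the inverse
% unitaries, since E = U^{-1} \circ (U \circ E \circ V) \circ V^{-1}:
\Delta(\mathcal{E}, \mathcal{F})
  &\leq \Delta(\mathcal{U} \circ \mathcal{E} \circ \mathcal{V}, \mathcal{U} \circ \mathcal{F} \circ \mathcal{V}).
\end{align}
```

Together the last two lines give the stated equality.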

For both theoreticians and experimentalists, there are 
strong motivations to find a gold standard satisfying 
these criteria — the need for a physically sensible way of 
evaluating the performance of a quantum process, and 
the need to compare the success of a theoretical model 
to the operation of a real, experimental system. For the 
experimentalist, however, there is also another important 
consideration. That is the need for diagnostic measures 
which can be used to build insight into the source of im- 
perfections in experimental implementations. Diagnostic 
measures may not necessarily be good candidates for our 
sought-after gold standard — they may fail to satisfy 
one or more of our criteria — but they still may be ex- 
tremely useful in the experimental context. Thus, some 
of the measures we discard as unsuitable for use as a gold 
standard may still be useful as diagnostic measures. Fur- 
thermore, it is not difficult to construct other examples 
of useful diagnostic measures, different to any considered 
in this paper. The detailed investigation of such diagnos- 
tic measures is, however, beyond the scope of the present 
paper. 

Prior work: The principal contribution of our paper is 
to comprehensively evaluate many plausible error measures
for quantum information processing, within the
broad framework of the criteria we have identified. So 
far as we are aware, none of the prior work has surveyed 
and compared error measures against such a broad array 
of theoretical and experimental concerns. 

Error measures for quantum teleportation have received
particular attention in the prior literature, perhaps
spurred by controversy over which experiments
should be regarded as definitively demonstrating the
teleportation effect [11]. Examples of this line of development
include [12, 13, 14, 15, 16, 17], and references
therein. With the exception of Ref. [17] this work differs
from ours in that it is focused primarily on the problem
of teleportation. Reference [17] has a more general
focus, but is not primarily concerned with the development
of error measures, but rather with the question of
when quantum information processing can be modeled
classically.

More mathematical investigations of error measures
have also been mounted, especially in the context of
quantum communication and fault-tolerant quantum
computation. Examples of this work include [13] and references therein.
This work (often embedded in some larger investigation)
typically focuses on one or a few measures of specific
interest for the problem at hand. These papers thus differ
from our work in that they don't attempt a comprehensive
survey of possible error measures against some set of
abstract criteria; nor, typically, do they address
experimental criteria such as ease of measurement. Nonetheless,
while this prior work is different in character from
ours, it has greatly informed our point of view, and we
will have occasion to cite it on specific points throughout
this paper. Of particular relevance is the work which
introduced one of the key measures we use, the stabilized
process distance, or S distance (referred to there as the
diamond norm), and emphasized some of the
important properties satisfied by that measure.

Structure of the paper: Secs. II and III summarize
background material on quantum operations and distance
measures for quantum states.

Section IV is the core of the paper, comprehensively
surveying possible approaches to the definition of error 
measures. Our strategy is to cast a wide net, considering 
many different possible approaches to the definition of a 
distance measure, and then to use our list of criteria to 
eliminate as many approaches as possible. This means 
a certain amount of tedium as we propose and then re- 
ject certain a priori plausible candidate error measures. 
The benefit of going through this process of elimination is 
considerable, however. First, it gives us confidence that 
the few measures we identify as particularly promising 
should be preferred over all other measures. Indeed, we 
quickly eliminate all but four of the measures we define as 
follows: the Jamiolkowski process fidelity (J fidelity), the 
Jamiolkowski process distance (J distance), the stabilized 
process fidelity (S fidelity), and the stabilized process dis- 
tance (S distance). Second, in several instances we show 
that error measures proposed previously in the literature
(in one case, by one of the authors of this paper) should 
be rejected as inadequate. 

Section V applies the four promising measures identified
in Sec. IV to the concrete problem of quantum computation,
showing that each measure has a useful operational
interpretation in terms of the success or failure of
a quantum computation. 

Section VI concludes the paper with a summary of our
results, and the identification of the S distance and the 
S fidelity as the two measures whose properties make 
them the most attractive candidates for use as a gold 
standard in quantum information processing. We do not 
make a final recommendation as to which of these two 
measures should be used, since they have extremely sim- 
ilar strengths and weaknesses. However, we do discuss 
and make definite recommendations regarding the re- 
porting of quantum information processing experiments. 
Furthermore, we sketch future research directions which 
may ameliorate some of the weaknesses of one or both 
measures, and which may therefore make it possible to 
definitively choose a single measure as a gold standard. 

II. DESCRIBING QUANTUM PROCESSES 

Quantum operations describe the most general physical
processes that may occur in a quantum system [3],
including unitary evolution, measurement, noise, and
decoherence. Any quantum operation may be given the
operator-sum representation relating input $\rho_{\rm in}$ and
output $\rho_{\rm out}$ states,

    $\rho_{\rm out} = \mathcal{E}(\rho_{\rm in}) = \sum_j E_j \rho_{\rm in} E_j^\dagger$,    (1)

where the operators $E_j$ are known as operation elements,
and obey the condition that $\sum_j E_j^\dagger E_j \leq I$. Note
that the operation elements $\{E_j\}$ completely describe
the effect of the process. We will mostly be concerned
with the case of trace-preserving operations, for which
$\sum_j E_j^\dagger E_j = I$. Physically, this corresponds to the
requirement that $\mathcal{E}$ represents a physical process without
post-selection [28]. Many of our results extend easily to
the case of non-trace-preserving operations, but to ease
the exposition we assume processes are trace-preserving
unless otherwise noted.
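
To make the operator-sum representation concrete, here is a minimal NumPy sketch. The helper names (`apply_channel`, `is_trace_preserving`) and the amplitude-damping example are our own illustrative assumptions; they do not appear in the original text.

```python
import numpy as np

def apply_channel(kraus, rho):
    """Return sum_j E_j rho E_j^dagger for a list of operation elements E_j."""
    return sum(E @ rho @ E.conj().T for E in kraus)

def is_trace_preserving(kraus, tol=1e-12):
    """Check the condition sum_j E_j^dagger E_j = I."""
    dim = kraus[0].shape[1]
    total = sum(E.conj().T @ E for E in kraus)
    return np.allclose(total, np.eye(dim), atol=tol)

# Assumed example (not from the text): amplitude damping with decay probability gamma.
gamma = 0.3
E0 = np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex)
E1 = np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)
kraus = [E0, E1]

rho_in = np.array([[0, 0], [0, 1]], dtype=complex)  # the excited state |1><1|
rho_out = apply_channel(kraus, rho_in)              # population gamma moves to |0><0|
```

For this choice the output state is diagonal, with populations `gamma` and `1 - gamma`, and the completeness check passes because the two elements satisfy the trace-preserving condition.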

The operator-sum representation has the drawback
that it is not unique, in the sense that there is a freedom
in the choice of operation elements. This is inconvenient
if we are trying to compare two processes. To alleviate
this, let us fix a basis $\{A_j\}$ for the space of operators,
choosing for convenience a basis orthonormal under the
Hilbert-Schmidt inner product, i.e., ${\rm tr}(A_j^\dagger A_k) = \delta_{jk}$ [29].
We can use this basis to expand the operation elements,
$E_j = \sum_m a_{jm} A_m$, and rewrite Eq. (1):

    $\mathcal{E}(\rho) = \sum_{mn} (\chi_{\mathcal{E}})_{mn} A_m \rho A_n^\dagger$,    (2)

where $(\chi_{\mathcal{E}})_{mn} = \sum_j a_{jm} a_{jn}^*$ are the elements of the
process matrix, $\chi_{\mathcal{E}}$. Equation (2) tells us that the process
matrix completely describes the action of the quantum
process. The big advantage of the process matrix
representation is that, unlike the operator-sum representation,
once the basis $\{A_j\}$ is chosen the process matrix can be
shown to be unique to the process [30]: i.e., it depends
only on $\mathcal{E}$, not on the particular choice of operation
elements $\{E_j\}$. We will not give an explicit proof of this
fact here, but note that this result follows easily from the
discussion below.
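
The uniqueness of the process matrix can be checked numerically. The sketch below is our own illustration: the function name `process_matrix`, the normalized Pauli basis, and the phase-flip example are assumptions, not taken from the original. It builds the process matrix from two different sets of operation elements for the same process and compares them.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
# Operator basis orthonormal under the Hilbert-Schmidt inner product.
basis = [P / np.sqrt(2) for P in (I2, X, Y, Z)]

def process_matrix(kraus, basis):
    """chi[m, n] = sum_j a[j, m] * conj(a[j, n]), where E_j = sum_m a[j, m] A_m."""
    a = np.array([[np.trace(A.conj().T @ E) for A in basis] for E in kraus])
    return a.T @ a.conj()

# Assumed example: phase-flip noise with probability p, written in two ways.
p = 0.2
kraus_a = [np.sqrt(1 - p) * I2, np.sqrt(p) * Z]
# A unitary mixing of the operation elements describes the same process.
kraus_b = [(kraus_a[0] + kraus_a[1]) / np.sqrt(2),
           (kraus_a[0] - kraus_a[1]) / np.sqrt(2)]

chi_a = process_matrix(kraus_a, basis)
chi_b = process_matrix(kraus_b, basis)
```

Both sets of operation elements give the same diagonal process matrix, with entries `2 * (1 - p)` on the identity component and `2 * p` on the Z component.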

The process matrix gives a convenient way of representing
the operation $\mathcal{E}$. A closely related but more abstract
representation is provided by the Jamiolkowski
isomorphism [31], which relates a quantum operation $\mathcal{E}$ to
a quantum state, $\rho_{\mathcal{E}}$:

    $\rho_{\mathcal{E}} = [\mathcal{I} \otimes \mathcal{E}](|\Phi\rangle\langle\Phi|)$,    (3)

where $|\Phi\rangle = \sum_j |j\rangle|j\rangle / \sqrt{d}$ is a maximally entangled state
of the ($d$-dimensional) system with another copy of itself,
and $\{|j\rangle\}$ is some orthonormal basis set. The map $\mathcal{E} \to \rho_{\mathcal{E}}$
is invertible, that is, knowledge of $\rho_{\mathcal{E}}$ is equivalent to
knowledge of $\mathcal{E}$ [32]. This isomorphism thus allows us
to treat quantum operations using the same tools as are
ordinarily used to treat quantum states. For later use we
note the useful property $\rho_{\mathcal{E} \otimes \mathcal{F}} = \rho_{\mathcal{E}} \otimes \rho_{\mathcal{F}}$.

The state $\rho_{\mathcal{E}}$ and the process matrix $\chi_{\mathcal{E}}$ are closely
related. A direct calculation shows that if one chooses the
operator basis set $\{A_j\} = \{|m\rangle\langle n|\}$, then $\chi_{\mathcal{E}} = d \rho_{\mathcal{E}}$, as
matrices. Thus we shall refer to both $\chi_{\mathcal{E}}$ and $\rho_{\mathcal{E}}$ as the
process matrix, and treat them interchangeably. This is
very convenient, as $\rho_{\mathcal{E}}$ is easy to work with mathematically,
using the expression Eq. (3), while the elements
of $\chi_{\mathcal{E}}$ have an obvious physical significance, expressed by
Eq. (2).
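
The relation between the Jamiolkowski state and the process matrix in the basis of matrix units can be verified directly. The following sketch is our own; the helper names and the ordering convention for the basis (index `n*d + m` for the unit with a 1 in row `m`, column `n`, chosen to match the ancilla-then-system tensor ordering) are assumptions made for the illustration.

```python
import numpy as np

def choi_state(kraus):
    """rho_E = (I tensor E)(|Phi><Phi|), with |Phi> = sum_j |j>|j> / sqrt(d)."""
    d = kraus[0].shape[0]
    phi = np.eye(d, dtype=complex).reshape(d * d) / np.sqrt(d)
    Phi = np.outer(phi, phi.conj())
    lifted = [np.kron(np.eye(d), E) for E in kraus]
    return sum(L @ Phi @ L.conj().T for L in lifted)

def process_matrix(kraus, basis):
    """chi[m, n] = sum_j a[j, m] * conj(a[j, n]), where E_j = sum_m a[j, m] A_m."""
    a = np.array([[np.trace(A.conj().T @ E) for A in basis] for E in kraus])
    return a.T @ a.conj()

def matrix_unit_basis(d):
    """Matrix units |m><n|, stored at index n*d + m (ancilla index first)."""
    units = []
    for n in range(d):
        for m in range(d):
            A = np.zeros((d, d), dtype=complex)
            A[m, n] = 1.0
            units.append(A)
    return units

# Assumed example: amplitude damping with decay probability 0.3.
gamma = 0.3
kraus = [np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex),
         np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)]

d = 2
rho_E = choi_state(kraus)
chi_E = process_matrix(kraus, matrix_unit_basis(d))
```

With this ordering, `chi_E` and `d * rho_E` agree entry by entry; a different ordering of the matrix units would give the same matrix up to a permutation of rows and columns.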

We conclude this section with a comment on our
notational conventions. We often use notation like $\psi$ to
denote either a pure state $|\psi\rangle$ or the corresponding density
matrix $|\psi\rangle\langle\psi|$, with the meaning to be determined from
context. Thus, for example, we may write $\psi = \alpha|0\rangle + \beta|1\rangle$
to indicate a pure state of a single qubit, while also writing
$\mathcal{E}(\psi)$ to indicate a quantum operation $\mathcal{E}$ acting on
the density matrix corresponding to that pure state.



III. DISTANCE MEASURES FOR QUANTUM 
STATES 

A natural starting place for an attempt to define a 
measure of distance for quantum processes is measures 
of distance for quantum states. The quantum informa- 
tion science community has identified the trace distance 
and the fidelity as particularly important approaches to 
the definition of a distance measure for states [33], and
these two measures will serve as the basis for our later 
definitions of distance measures for quantum operations. 
In keeping with the aims of the paper, we don't make 
a choice between the trace distance and the fidelity at
the outset. Instead, our preference is to develop distance 
measures for quantum operations based on both the trace 
distance and the fidelity, and then assess them using the 
criteria discussed in the introduction. We now briefly re- 
view the basic properties of the trace distance and the 
fidelity. 

The trace distance: The trace distance between density
matrices $\rho$ and $\sigma$ is defined by $D(\rho, \sigma) = \frac{1}{2} {\rm tr}|\rho - \sigma|$, where
$|X| = \sqrt{X^\dagger X}$. From this definition it follows that the
trace distance is a genuine metric on quantum states,
with $0 \leq D \leq 1$. The trace distance also has many
other attractive properties that make it a particularly
good measure of distance between quantum states. We
now briefly describe three of these.

First, the trace distance has a compelling physical
interpretation as a measure of state distinguishability.
Suppose Alice prepares a quantum system in the state $\rho$ with
probability $\frac{1}{2}$, and in the state $\sigma$ with probability $\frac{1}{2}$. She
gives the system to Bob, who performs a POVM measurement
to distinguish the two states. It can be shown
that Bob's optimal probability of correctly identifying which state
Alice prepared is $1/2 + D(\rho, \sigma)/2$. That is, $D(\rho, \sigma)$ can
be interpreted, up to the factor 1/2, as the optimal bias
in favour of Bob correctly determining which of the two
states was prepared. This physical interpretation follows
from the identity $D(\rho, \sigma) = \max_{E \leq I} {\rm tr}(E(\rho - \sigma))$ [34],
where the maximum is over all positive operators $E$
satisfying $E \leq I$.
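
The distinguishability interpretation can be illustrated numerically. In the sketch below (our own; the function names and the choice of states are assumptions), the trace distance is computed from the eigenvalues of the difference of the two states, and the maximizing operator is taken to be the projector onto the positive eigenspace of that difference, which is a standard choice that attains the maximum.

```python
import numpy as np

def trace_distance(rho, sigma):
    """D = (1/2) tr|rho - sigma|, from the eigenvalues of the Hermitian difference."""
    return 0.5 * np.sum(np.abs(np.linalg.eigvalsh(rho - sigma)))

def optimal_effect(rho, sigma):
    """Projector onto the positive eigenspace of rho - sigma."""
    w, v = np.linalg.eigh(rho - sigma)
    pos = v[:, w > 0]
    return pos @ pos.conj().T

# Assumed example: distinguish |0> from |+> = (|0> + |1>)/sqrt(2).
ket0 = np.array([1, 0], dtype=complex)
ketp = np.array([1, 1], dtype=complex) / np.sqrt(2)
rho = np.outer(ket0, ket0.conj())
sigma = np.outer(ketp, ketp.conj())

D = trace_distance(rho, sigma)
E = optimal_effect(rho, sigma)
bias = float(np.real(np.trace(E @ (rho - sigma))))  # equals D for the optimal E
p_success = 0.5 + 0.5 * D                           # Bob's optimal success probability
```

For these two pure states the overlap is 1/2, so the trace distance is the square root of 1/2 and the optimal success probability is roughly 0.85.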

Second, the trace distance possesses the contractivity
property [35], that is, $D(\mathcal{E}(\rho), \mathcal{E}(\sigma)) \leq D(\rho, \sigma)$ whenever
$\mathcal{E}$ is a trace-preserving quantum operation. This statement
expresses the physical fact that a quantum process
acting on two quantum states cannot increase their
distinguishability. Contractivity follows from the physical
interpretation of $D(\rho, \sigma)$ described above.

Third, the trace distance is doubly convex, i.e.,
if $p_j$ are probabilities then
$D(\sum_j p_j \rho_j, \sum_j p_j \sigma_j) \leq \sum_j p_j D(\rho_j, \sigma_j)$.
This inequality can be physically interpreted
as the statement that the distinguishability between
the states $\sum_j p_j \rho_j$ and $\sum_j p_j \sigma_j$, where $j$ is not
known, can never be greater than the average
distinguishability when $j$ is known, but has been chosen at
random according to the distribution $p_j$.

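Fidelity: The fidelity between density matrices $\rho$ and
$\sigma$ is defined by

    $F(\rho, \sigma) = \left( {\rm tr} \sqrt{\sqrt{\rho}\, \sigma \sqrt{\rho}} \right)^2$.    (4)

When $\rho = \psi$ is a pure state, this reduces to
$F(\psi, \sigma) = \langle\psi|\sigma|\psi\rangle$, the overlap between $\psi$ and $\sigma$.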
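
As a check on the definition, the following sketch (our own; `psd_sqrt`, `fidelity`, and the example states are illustrative assumptions) evaluates the fidelity via eigendecompositions and compares it with the overlap formula for a pure first argument.

```python
import numpy as np

def psd_sqrt(m):
    """Square root of a positive semidefinite Hermitian matrix."""
    w, v = np.linalg.eigh(m)
    return (v * np.sqrt(np.clip(w, 0, None))) @ v.conj().T

def fidelity(rho, sigma):
    """F = (tr sqrt(sqrt(rho) sigma sqrt(rho)))^2, the squared convention used here."""
    s = psd_sqrt(rho)
    w = np.linalg.eigvalsh(s @ sigma @ s)
    return float(np.sum(np.sqrt(np.clip(w, 0, None))) ** 2)

# Assumed example: a pure state psi and a mixed state sigma of one qubit.
psi = np.array([np.sqrt(0.8), np.sqrt(0.2)], dtype=complex)
rho = np.outer(psi, psi.conj())
sigma = np.array([[0.6, 0.1], [0.1, 0.4]], dtype=complex)

F = fidelity(rho, sigma)
overlap = float(np.real(psi.conj() @ sigma @ psi))  # <psi|sigma|psi>
```

Up to floating-point error the two numbers coincide, and swapping the arguments of `fidelity` gives the same value, in line with the symmetry of the fidelity.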

The fidelity also has many attractive properties. It can
be shown that $0 \leq F(\rho, \sigma) \leq 1$, with equality in the
second inequality if and only if $\rho = \sigma$. The fidelity is thus
not a metric as such, but serves rather as a generalized
measure of the overlap between two quantum states. The
fidelity is also symmetric in its inputs, $F(\rho, \sigma) = F(\sigma, \rho)$,
a fact that is not obvious from the definition we have
given, but which follows from other equivalent definitions.

There is an ambiguity in the literature in the definition
of fidelity that is worth commenting on here. Both
the quantity defined above and its square root have been
referred to as the fidelity, and both have many appealing
properties.

Nevertheless, we strongly advocate using the definition
of Eq. (4), despite the other definition being used in some
references. As we will see in Sec. V, adopting the
definition of Eq. (4) gives rise to a measure of distance
between quantum processes with a physically compelling
interpretation in terms of the probability of success of a
quantum computation. Adopting the other definition of
fidelity would make about as much sense as reporting the
square root of the probability that the quantum
computation succeeded.

Although not a metric, the fidelity can easily be turned
into a metric. Two common ways of doing this are the
Bures metric, defined by $B(\rho, \sigma) = \sqrt{2 - 2\sqrt{F(\rho, \sigma)}}$,
and the angle, defined by $A(\rho, \sigma) = \arccos \sqrt{F(\rho, \sigma)}$.
The origin of these metrics can be seen intuitively by
considering the case when $\rho$ and $\sigma$ are both pure states.
The Bures metric is just the Euclidean distance between
the two pure states, with respect to the usual norm on
state space [13], while the angle is, as the name suggests,
just the angle between the two states, with respect to the
usual inner product on state space.

In addition to the angle and the Bures metric we will
find it convenient to introduce a third metric based on
the fidelity. This metric does not seem to have been
previously recognized in the literature, but arises naturally
later in this paper in the context of quantum computation.
It is defined by $C(\rho, \sigma) = \sqrt{1 - F(\rho, \sigma)}$. The only
difficult step in proving this is a metric is the proof of
the triangle inequality [38].
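
The three fidelity-based metrics are simple functions of the fidelity, and their triangle inequalities can be spot-checked on random states. The sketch below is our own; the function names, the random-state construction, and the sample size are assumptions, and a numerical check of this kind is an illustration rather than a proof.

```python
import numpy as np

def psd_sqrt(m):
    w, v = np.linalg.eigh(m)
    return (v * np.sqrt(np.clip(w, 0, None))) @ v.conj().T

def fidelity(rho, sigma):
    s = psd_sqrt(rho)
    w = np.linalg.eigvalsh(s @ sigma @ s)
    return float(np.sum(np.sqrt(np.clip(w, 0, None))) ** 2)

def angle(rho, sigma):
    """A = arccos sqrt(F)."""
    return float(np.arccos(np.sqrt(min(fidelity(rho, sigma), 1.0))))

def bures(rho, sigma):
    """B = sqrt(2 - 2 sqrt(F))."""
    return float(np.sqrt(max(2 - 2 * np.sqrt(fidelity(rho, sigma)), 0.0)))

def c_metric(rho, sigma):
    """C = sqrt(1 - F)."""
    return float(np.sqrt(max(1 - fidelity(rho, sigma), 0.0)))

def random_state(d, rng):
    """A random full-rank density matrix (assumed sampling scheme)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(0)
worst_slack = np.inf
for _ in range(200):
    r, s, t = (random_state(2, rng) for _ in range(3))
    for metric in (angle, bures, c_metric):
        # Triangle inequality: metric(r, t) <= metric(r, s) + metric(s, t).
        slack = metric(r, s) + metric(s, t) - metric(r, t)
        worst_slack = min(worst_slack, slack)
```

A non-negative `worst_slack` (up to rounding) means no violation of the triangle inequality was found in the sample.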

In later sections our discussion will sometimes focus
on the fidelity, and sometimes on metrics derived from
the fidelity. We will say that a metric $\Delta^F(\rho, \sigma)$ on state
space is a fidelity-based metric if it is a monotonically
decreasing function of the fidelity $F(\rho, \sigma)$. Obviously the
angle, the Bures metric and $C(\cdot, \cdot)$ are all fidelity-based
metrics. It is often the case that the specific details of the
metric used are not important, and whenever possible we
state results using the fidelity as a single unifying
concept. However, sometimes it will prove advantageous to
use the fidelity-based metrics directly. In particular, they
have the advantage of satisfying the triangle inequality,
which turns out to be useful in proving the chaining
criterion [property (6)].

Like the trace distance, the fidelity and its derived metrics
have many other nice properties. It can be shown [40]
that $F(\mathcal{E}(\rho), \mathcal{E}(\sigma)) \geq F(\rho, \sigma)$ for any trace-preserving
quantum operation $\mathcal{E}$. We call this the monotonicity
property of the fidelity. It follows that any fidelity-based
metric satisfies a contractivity property analogous to that
satisfied by the trace distance.

The fidelity also satisfies a property analogous to the
double convexity of the trace distance. Precisely, the
square root of the fidelity is doubly concave, that is,
$F(\sum_j p_j \rho_j, \sum_j p_j \sigma_j)^{1/2} \geq \sum_j p_j F(\rho_j, \sigma_j)^{1/2}$.
This double concavity can be used to prove double convexity of
certain fidelity-based metrics. In particular, supposing
$\Delta^F$ is a fidelity-based metric which is convex in the
square root of the fidelity (the angle, the Bures metric
and $C(\cdot, \cdot)$ are all easily verified to have this property),
then it is easy to verify that $\Delta^F$ is doubly convex.

One drawback of the fidelity is that it is difficult to find
a compelling physical interpretation. When $\rho$ and $\sigma$ are
mixed states, no completely satisfactory interpretation
of the fidelity is known. When
$\rho = \psi$ is a pure state, we have $F(\psi, \sigma) = \langle\psi|\sigma|\psi\rangle$, the
overlap between $\psi$ and $\sigma$. Physically, we might imagine
$\sigma$ is an attempt to prepare the pure state $\psi$. In this case
the fidelity coincides with the probability that a perfect
measurement testing whether the state is $\psi$ will succeed.
It is this property of the fidelity that is used in Sec. V
to connect our fidelity-based error measures for quantum
processes to the probability of success of a quantum
computation.

General comments: The fidelity is, at present, perhaps 
somewhat more widely used in the quantum information 
science community than is the trace distance. However, 
we shall see below that the trace distance and the fidelity 
have complementary advantages as a basis for developing 
measures of distance for quantum operations, and so it 
is useful to investigate both. In any case, the two measures
are, as one might expect, quite closely related. In
particular, it is possible to show that they are related by
the inequalities [43]:

    $1 - \sqrt{F(\rho, \sigma)} \leq D(\rho, \sigma) \leq \sqrt{1 - F(\rho, \sigma)}$.    (5)

It is not difficult to construct examples of saturation
for both inequalities. Note that the second inequality
is always saturated for pure states, i.e.,
$D(\psi, \phi) = \sqrt{1 - F(\psi, \phi)}$ for pure states $\psi$ and $\phi$.
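
The inequalities of Eq. (5) can be spot-checked numerically. The following sketch is our own illustration; the helper names, the random sampling scheme, and the tolerances are assumptions. It tests both bounds on random mixed states and the saturation of the upper bound on a pair of random pure states.

```python
import numpy as np

def psd_sqrt(m):
    w, v = np.linalg.eigh(m)
    return (v * np.sqrt(np.clip(w, 0, None))) @ v.conj().T

def fidelity(rho, sigma):
    s = psd_sqrt(rho)
    w = np.linalg.eigvalsh(s @ sigma @ s)
    return float(np.sum(np.sqrt(np.clip(w, 0, None))) ** 2)

def trace_distance(rho, sigma):
    return float(0.5 * np.sum(np.abs(np.linalg.eigvalsh(rho - sigma))))

def random_state(d, rng):
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def random_pure(d, rng):
    v = rng.normal(size=d) + 1j * rng.normal(size=d)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

rng = np.random.default_rng(1)
bounds_hold = True
for _ in range(500):
    rho, sigma = random_state(3, rng), random_state(3, rng)
    F, D = fidelity(rho, sigma), trace_distance(rho, sigma)
    lower_ok = 1 - np.sqrt(F) <= D + 1e-9
    upper_ok = D <= np.sqrt(max(1 - F, 0.0)) + 1e-9
    bounds_hold = bounds_hold and lower_ok and upper_ok

# Pure states: the upper bound should be saturated.
psi, phi = random_pure(3, rng), random_pure(3, rng)
gap = abs(trace_distance(psi, phi) - np.sqrt(max(1 - fidelity(psi, phi), 0.0)))
```

Here `bounds_hold` records whether both inequalities held for every sampled pair, and `gap` measures how far the pure-state pair is from saturating the second inequality.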



IV. ERROR MEASURES FOR QUANTUM 
PROCESSES 

Our goal in this paper is to recommend a single er- 
ror measure enabling researchers to compare the perfor- 
mance of quantum information processing experiments 
against the theoretical ideal. As the basis for such a rec- 
ommendation, in this section we comprehensively survey 
possible definitions of such error measures, and do a pre- 
liminary assessment of each measure against the criteria 
introduced earlier in this paper. 

We take three basic approaches to defining an error
measure for processes. In Sec. IV A we investigate
approaches based on the process matrix, $\rho_{\mathcal{E}}$. In Sec. IV B we
investigate approaches based on the average behaviour of
a process. Finally, in Sec. IV C we investigate approaches
based on the worst-case behaviour of a process. In each
case we investigate measures based on both the trace
distance and the fidelity. We will describe connections
between the various measures, and identify four measures of
particular merit. The properties of these four measures 
will be discussed in more detail in the next section. 

Nomenclature: In the following treatment we shall
use the unadorned symbol $\Delta$ to mean a metric between
states. Our approach is to use state-based metrics to
form metrics between processes, and these will also be
represented by $\Delta$ but with a subscript denoting the
method used, e.g. $\Delta_{\rm ave}$ is a process metric based on the
average over input states. Where we need to specialize to
a specific state metric we will use a superscript with the
symbol representing that metric ($A$, $B$, $C$, and $D$ from
Sec. III), or use that symbol directly with a subscript
for the method, e.g. $\Delta^D_{\rm ave} = D_{\rm ave}$ is the process metric
based on the average trace distance. The chief departure
from these conventions will be due to the fidelity, which
is not a metric. We will use the notation $\Delta^F$ to mean any
metric derived from the fidelity (e.g. $A$, $B$, and $C$) and
the symbol $F$ with a subscript to mean a process measure
based on fidelity, for example $F_{\rm ave}$ is the average fidelity.

A. Error measures based on the process matrix 

Suppose $\Delta(\rho, \sigma)$ is any metric on the space of quantum
states. A natural approach to defining a measure $\Delta_{\rm pro}$
of the distance between two quantum processes is

    $\Delta_{\rm pro}(\mathcal{E}, \mathcal{F}) = \Delta(\rho_{\mathcal{E}}, \rho_{\mathcal{F}})$.    (6)

Defining $\Delta_{\rm pro}$ in this way automatically gives $\Delta_{\rm pro}$ the
metric property. Provided $\Delta(\cdot, \cdot)$ is easy to calculate,
$\Delta_{\rm pro}$ is also easy to calculate. Furthermore, since $\mathcal{E}$ can
be experimentally determined using quantum process
tomography, it follows that $\Delta_{\rm pro}$ can be experimentally
measured, at least in principle.
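
As an illustration of Eq. (6), the sketch below computes the trace-distance and fidelity versions of the process-matrix measure for a noisy NOT gate. Everything specific here is our own assumption: the helper names, the depolarizing noise model with strength `p`, and the shortcut of evaluating the fidelity as an overlap, which is valid only because the ideal process is unitary and its process matrix is therefore a pure state.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def choi_state(kraus):
    """rho_E = (I tensor E)(|Phi><Phi|)."""
    d = kraus[0].shape[0]
    phi = np.eye(d, dtype=complex).reshape(d * d) / np.sqrt(d)
    Phi = np.outer(phi, phi.conj())
    return sum(np.kron(np.eye(d), E) @ Phi @ np.kron(np.eye(d), E).conj().T
               for E in kraus)

def trace_distance(rho, sigma):
    return float(0.5 * np.sum(np.abs(np.linalg.eigvalsh(rho - sigma))))

def d_pro(kraus_e, kraus_f):
    """Trace distance between the two process matrices."""
    return trace_distance(choi_state(kraus_e), choi_state(kraus_f))

def f_pro_unitary(kraus, U):
    """Fidelity with a unitary target: <Phi_U| rho_E |Phi_U>."""
    d = U.shape[0]
    phi = np.eye(d, dtype=complex).reshape(d * d) / np.sqrt(d)
    phi_U = np.kron(np.eye(d), U) @ phi
    return float(np.real(phi_U.conj() @ choi_state(kraus) @ phi_U))

# Assumed noise model: ideal NOT gate followed by depolarizing noise of strength p.
p = 0.1
depol = [np.sqrt(1 - 3 * p / 4) * I2, np.sqrt(p / 4) * X,
         np.sqrt(p / 4) * Y, np.sqrt(p / 4) * Z]
ideal = [X]
real = [K @ X for K in depol]

D_pro = d_pro(real, ideal)
F_pro = f_pro_unitary(real, X)
```

For this noise model the two quantities work out to `3 * p / 4` and `1 - 3 * p / 4` respectively, so in this particular example they carry the same information.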

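What about the other properties? The properties of
stability and chaining can be obtained by making some
natural extra assumptions about the state metric $\Delta$,
which we now describe. Suppose first that the metric
$\Delta$ is stable in the sense that $\Delta(\rho \otimes \tau, \sigma \otimes \tau) = \Delta(\rho, \sigma)$.
This is easily seen to be the case for the trace distance
and for any fidelity-based metric, for example. The
stability property for $\Delta_{\rm pro}$ follows immediately:
$\Delta_{\rm pro}(\mathcal{I} \otimes \mathcal{E}, \mathcal{I} \otimes \mathcal{F}) = \Delta(\rho_{\mathcal{I}} \otimes \rho_{\mathcal{E}}, \rho_{\mathcal{I}} \otimes \rho_{\mathcal{F}}) = \Delta(\rho_{\mathcal{E}}, \rho_{\mathcal{F}}) = \Delta_{\rm pro}(\mathcal{E}, \mathcal{F})$.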

The chaining property can be proved, with some
caveats to be described below, by assuming that $\Delta(\cdot, \cdot)$
is contractive, i.e., $\Delta(\mathcal{E}(\rho), \mathcal{E}(\sigma)) \leq \Delta(\rho, \sigma)$, for
trace-preserving operations $\mathcal{E}$. We have already seen that this
is a natural physical assumption satisfied by the trace
distance and any fidelity-based metric.

Suppose then that $\Delta$ is contractive with respect to
trace-preserving operations. We claim that $\Delta_{\rm pro}$ satisfies
the chaining property,
$\Delta_{\rm pro}(\mathcal{E}_2 \circ \mathcal{E}_1, \mathcal{F}_2 \circ \mathcal{F}_1) \leq \Delta_{\rm pro}(\mathcal{E}_2, \mathcal{F}_2) + \Delta_{\rm pro}(\mathcal{E}_1, \mathcal{F}_1)$,
provided $\mathcal{F}_1$ is doubly stochastic, i.e., $\mathcal{F}_1$ is
trace-preserving and satisfies $\mathcal{F}_1(I) = I$; this assumption is
used at a certain point in our proof of chaining. This
may seem like a significant assumption, since physical
processes such as relaxation to a finite temperature are
not doubly stochastic. However, in quantum information
science we are typically interested in the case when
$\mathcal{F}_1$ and $\mathcal{F}_2$ are ideal unitary processes, and we are using
$\Delta_{\rm pro}$ to compare the composition of these two ideal
processes to the experimentally realized process $\mathcal{E}_2 \circ \mathcal{E}_1$. Since
unitary processes are automatically doubly stochastic, it
follows that chaining holds in this case, which is the case
of usual interest.

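The proof of chaining begins by applying the triangle
inequality to obtain

    $\Delta_{\rm pro}(\mathcal{E}_2 \circ \mathcal{E}_1, \mathcal{F}_2 \circ \mathcal{F}_1) \leq \Delta(\rho_{\mathcal{E}_2 \circ \mathcal{E}_1}, \rho_{\mathcal{E}_2 \circ \mathcal{F}_1})$    (7)
        $+ \Delta(\rho_{\mathcal{E}_2 \circ \mathcal{F}_1}, \rho_{\mathcal{F}_2 \circ \mathcal{F}_1})$.    (8)

Then note the easily verified identity $\rho_{\mathcal{E} \circ \mathcal{F}} = (\mathcal{F}^T \otimes \mathcal{E})(\Phi)$,
where $\Phi$ is the maximally entangled state defined
earlier, we define $\mathcal{F}^T(\rho) = \sum_j F_j^T \rho F_j^*$, and $F_j$ are the
operation elements for $\mathcal{F}$ [cf. Eq. (1)]. Applying this
identity to both density matrices in the second term on
the right-hand side of Eq. (8) gives

    $\Delta_{\rm pro}(\mathcal{E}_2 \circ \mathcal{E}_1, \mathcal{F}_2 \circ \mathcal{F}_1) \leq \Delta(\rho_{\mathcal{E}_2 \circ \mathcal{E}_1}, \rho_{\mathcal{E}_2 \circ \mathcal{F}_1})$
        $+ \Delta((\mathcal{F}_1^T \otimes \mathcal{E}_2)(\Phi), (\mathcal{F}_1^T \otimes \mathcal{F}_2)(\Phi))$.    (9)

The double stochasticity of $\mathcal{F}_1$ implies that $\mathcal{F}_1^T$ is a
trace-preserving quantum operation. We can therefore apply
contractivity to both the first and the second terms on
the right-hand side of Eq. (9), giving the desired result.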
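
The chaining inequality for the trace-distance version of the process-matrix measure can be checked on a small example. The sketch below is our own; the helper names, the two noise models, and the choice of ideal gates (a Hadamard followed by a NOT, both unitary and hence doubly stochastic) are assumptions made for the illustration.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = (X + Z) / np.sqrt(2)

def choi_state(kraus):
    d = kraus[0].shape[0]
    phi = np.eye(d, dtype=complex).reshape(d * d) / np.sqrt(d)
    Phi = np.outer(phi, phi.conj())
    return sum(np.kron(np.eye(d), E) @ Phi @ np.kron(np.eye(d), E).conj().T
               for E in kraus)

def d_pro(kraus_e, kraus_f):
    diff = choi_state(kraus_e) - choi_state(kraus_f)
    return float(0.5 * np.sum(np.abs(np.linalg.eigvalsh(diff))))

def compose(second, first):
    """Operation elements of (second o first): apply `first`, then `second`."""
    return [B @ A for B in second for A in first]

# Assumed noise models: depolarizing after H, amplitude damping after X.
p, gamma = 0.05, 0.1
depol = [np.sqrt(1 - 3 * p / 4) * I2, np.sqrt(p / 4) * X,
         np.sqrt(p / 4) * Y, np.sqrt(p / 4) * Z]
damp = [np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex),
        np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)]

F1, F2 = [H], [X]               # ideal unitary steps
E1 = [K @ H for K in depol]     # noisy first step
E2 = [K @ X for K in damp]      # noisy second step

lhs = d_pro(compose(E2, E1), compose(F2, F1))
rhs = d_pro(E2, F2) + d_pro(E1, F1)
```

The error of the two-step process, `lhs`, should not exceed the sum of the single-step errors, `rhs`, as the chaining property asserts.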

Only one property of $\Delta_{\rm pro}$ remains in question, and
that is whether or not it has a good physical
interpretation. We will see in Sec. V that $D_{\rm pro}$ and $F_{\rm pro}$ can
both be related in a natural way to the average probability
with which a quantum computation fails or succeeds,
providing a good physical interpretation for these
quantities.

Although $\Delta_{\rm pro}$ may be calculated easily in principle
for both the trace distance and fidelity-based approaches,
the fidelity-based measures have some substantial
advantages. The reason is that, so far as we are aware,
experimentally determining $D_{\rm pro}$ requires doing full process
tomography, which for a $d$-dimensional quantum system
requires the estimation of $d^4 - d^2$ observable averages.
By contrast, when $U$ is a unitary operation it turns out
that the fidelity $F_{\rm pro}(\mathcal{E}, U)$ (and related error measures)
can be determined based upon the estimation of at most
$2d^2$ observable averages, and in particular, $d^2$ observable
averages for qubits. This makes $F_{\rm pro}(\mathcal{E}, U)$ and related
error measures substantially easier to determine
experimentally than $D_{\rm pro}$. The key to proving this is the
observation [44]

    $F_{\rm pro}(\mathcal{E}, U) = \frac{1}{d^3} \sum_j {\rm tr}(U U_j^\dagger U^\dagger \mathcal{E}(U_j))$,    (10)

where the $\{U_j\}$ are a basis of unitary operators orthogonal
under the Hilbert-Schmidt inner product, satisfying
${\rm tr}(U_j^\dagger U_k) = d \delta_{jk}$. Up to scaling we saw an example of
such a set in Sec. II: the $n$-qubit tensor products formed
from the Pauli matrices and the identity matrix.
Equation (10) does not provide a direct way of estimating
$F_{\rm pro}$. But suppose we expand the $U_j$ in terms of a set
of input states, $\rho_k$: $U_j = \sum_k a_{jk} \rho_k$. These input states
must span the entire operator space, and thus there must
be $d^2$ of them; we will see an explicit example below for
two qubits. We also expand $U U_j^\dagger U^\dagger$ in terms of a set of
observables, $\sigma_l$: $U U_j^\dagger U^\dagger = \sum_l b_{jl} \sigma_l$. These observables
must also span the entire operator space. Substitution
into Eq. (10) gives

    $F_{\rm pro}(\mathcal{E}, U) = \frac{1}{d^3} \sum_{kl} M_{kl} {\rm tr}(\sigma_l \mathcal{E}(\rho_k))$,    (11)

where $M_{kl} = \sum_j b_{jl} a_{jk}$. This equation gives a method to evaluate
$F_{\rm pro}$: choose a spanning set of $d^2$ input states $\rho_k$ which can be
prepared experimentally, and a set of observables $\sigma_l$ whose averages we
can reliably measure; determine the matrix $M = (M_{kl})$, whose elements depend
only on known quantities ($\rho_k$, $\sigma_l$, and the idealized operation $U$),
not on the unknown $\mathcal{E}$. The non-zero matrix elements in $M$ will
determine which observable averages need to be estimated for calculating
$F_{\rm pro}(\mathcal{E},U)$. In general, $d^4$ observable averages will need to
be estimated. However, suppose we choose some fixed set of $\rho_k$, and then
define $\sigma_l = \sum_k a_{kl}\, U U_k^\dagger U^\dagger$ [45]. In this case it
is easily verified that Eq. (11) simplifies to:

$$F_{\rm pro}(\mathcal{E},U) = \frac{1}{d^3}\sum_k \mathrm{tr}\left(\sigma_k\,\mathcal{E}(\rho_k)\right), \tag{12}$$

which only requires between $d^2$ and $2d^2$ measurements. The drawback is that
in this method we are not free to choose the $\sigma_l$; they are determined by
$U$ and the $\rho_k$.

In practical situations, certain input states and measurements are easier to use
than others. We envisage an experimentalist choosing the set of input states and
measurements according to convenience and using the prescription above to
calculate which combinations are necessary. In general this will require fewer
measurements than full process tomography. This direct method has the additional
advantage of making it easier to estimate the experimental error in
$F_{\rm pro}$.

For example, consider an $n$-qubit process, $U$. Suppose we select the $U_j$ to
range over the $n$-fold tensor products of Pauli matrices (including the identity
matrix). Suppose furthermore that for each qubit we select the input states from
the set $\{I, I+X, I+Y, I+Z\}$ (where $X$, $Y$, $Z$ are the usual Pauli
operators), so that we choose $\rho_k$ from the set of all possible tensor
products of the single qubit input states. Now, choosing
$\sigma_l = \sum_k a_{kl}\, U U_k U^\dagger$ we see that the $a_{kl}$ will always
be real, and since the $U_k$ are Hermitian the $\sigma_l$ are also Hermitian.
Thus Eq. (12) tells us that we need to estimate only $d^2$ observable averages to
evaluate $F_{\rm pro}$ for any $U$, far fewer than the $d^4 - d^2$ observable
averages necessary to do full process tomography on $n$ qubits.
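
As an illustration of this prescription, the following is a minimal numerical sketch, not the authors' implementation. It assumes NumPy, normalises the input states to unit trace (e.g. $(I+X)/2$), and uses a hypothetical test process `noisy_gate` (the ideal gate followed by depolarisation with probability `p`); all function names are introduced here for illustration only. It evaluates $F_{\rm pro}$ both from the operator-basis sum of Eq. (10) and from the $d^2$ state/observable pairs of Eq. (12).

```python
import itertools
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def tensor(ops):
    """Kronecker product of a sequence of matrices."""
    out = np.eye(1, dtype=complex)
    for op in ops:
        out = np.kron(out, op)
    return out


def pauli_basis(n):
    """The 4**n tensor products of I, X, Y, Z (the U_j of the text)."""
    return [tensor(t) for t in itertools.product([I2, X, Y, Z], repeat=n)]


def input_states(n):
    """Tensor products of I, I+X, I+Y, I+Z, each normalised to unit trace."""
    single = [I2 / 2, (I2 + X) / 2, (I2 + Y) / 2, (I2 + Z) / 2]
    return [tensor(t) for t in itertools.product(single, repeat=n)]


def noisy_gate(U, p):
    """Hypothetical test process: apply U, then depolarise with probability p."""
    d = U.shape[0]

    def channel(op):
        return (1 - p) * (U @ op @ U.conj().T) + p * np.trace(op) * np.eye(d) / d

    return channel


def f_pro_eq10(channel, U, n):
    """Eq. (10): sum over the unitary operator basis."""
    d = 2 ** n
    total = sum(np.trace(U @ Uj.conj().T @ U.conj().T @ channel(Uj))
                for Uj in pauli_basis(n))
    return float(np.real(total)) / d ** 3


def f_pro_eq12(channel, U, n):
    """Eq. (12): d**2 averages tr(sigma_l E(rho_l)) on physical input states."""
    d = 2 ** n
    paulis, rhos = pauli_basis(n), input_states(n)
    R = np.column_stack([r.reshape(-1) for r in rhos])
    P = np.column_stack([u.reshape(-1) for u in paulis])
    a = np.linalg.solve(R, P).T  # a[j, k] with U_j = sum_k a[j, k] rho_k
    total = 0.0
    for l, rho in enumerate(rhos):
        # sigma_l = sum_k a[k, l] U U_k^dagger U^dagger
        sigma = sum(a[k, l] * (U @ paulis[k].conj().T @ U.conj().T)
                    for k in range(d * d))
        total += np.real(np.trace(sigma @ channel(rho)))
    return float(total) / d ** 3


CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
                 [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)
process = noisy_gate(CNOT, p=0.1)
print(f_pro_eq10(process, CNOT, 2), f_pro_eq12(process, CNOT, 2))
```

For this particular test process both routes should agree with the closed form $1 - p + p/d^2$, which is $0.90625$ for $p = 0.1$ and $d = 4$. In an experiment the call `channel(rho)` would be replaced by measured averages of the $\sigma_l$ on the prepared states $\rho_l$.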

It is an interesting problem deserving further exploration to find the minimum
number of measurements required to estimate $F_{\rm pro}$ when there are
constraints on what input states and observables are available. For instance, it
would be useful to know the optimal number for the case where we are restricted
to separable inputs and product observables, i.e., inputs and observables that
can be given direct local implementations.



B. Error measures based on the average case 

Another natural approach for defining error measures for quantum operations is
to compare output states and average over all input states, where the output
states can be compared using the distance measures for states described in
Section III. We define

$$\Delta_{\rm ave}(\mathcal{E},\mathcal{F}) = \int d\psi\, \Delta\left(\mathcal{E}(\psi),\mathcal{F}(\psi)\right), \tag{13}$$

where the integral is over the uniform (Haar) measure on state space.

While this approach seems intuitively sensible, it turns out that the resulting
measures satisfy few of our criteria. The only two properties these measures
appear to satisfy in general, for an arbitrary state metric $\Delta$, are the
metric and chaining criteria, both of which follow immediately from the metric
property of $\Delta$.

The average-based metrics are less successful in meeting the other criteria.
Even when $\Delta$ is easy to calculate, it is not obvious that the integral in
Eq. (13) will have a simple form that enables easy calculation of
$\Delta_{\rm ave}$. This, in turn, means that $\Delta_{\rm ave}$ may not be so
easy to determine experimentally. So far as we are aware, no simple expressions
are known for $\Delta_{\rm ave}$ for any of the metrics we have discussed.
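
In the absence of a closed form, the integral in Eq. (13) can at least be estimated by sampling. The sketch below is one possible approach rather than a method taken from the text: it assumes NumPy, takes $\Delta$ to be the trace distance, and draws Haar-random pure states by normalising complex Gaussian vectors. The function names and the two example channels are illustrative assumptions.

```python
import numpy as np


def haar_random_state(d, rng):
    """Pure state drawn from the uniform (Haar) measure, as a density matrix."""
    v = rng.normal(size=d) + 1j * rng.normal(size=d)
    v /= np.linalg.norm(v)
    return np.outer(v, v.conj())


def trace_distance(rho, sigma):
    """D(rho, sigma): half the sum of |eigenvalues| of the Hermitian difference."""
    return 0.5 * np.abs(np.linalg.eigvalsh(rho - sigma)).sum()


def d_ave_monte_carlo(chan_e, chan_f, d, samples=2000, seed=0):
    """Monte Carlo estimate of Eq. (13) with the trace distance as state metric."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(samples):
        psi = haar_random_state(d, rng)
        total += trace_distance(chan_e(psi), chan_f(psi))
    return total / samples


def identity_channel(rho):
    return rho


def fully_depolarizing(rho):
    return np.trace(rho) * np.eye(2) / 2


print(d_ave_monte_carlo(identity_channel, fully_depolarizing, d=2))
```

For this pair of single-qubit channels every pure input is at trace distance $1/2$ from the maximally mixed output, so the estimate equals $1/2$ regardless of the sample. For generic channels the result carries a statistical error that shrinks as the inverse square root of `samples`.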

It is not surprising that the physical interpretations of these metrics rely
heavily on the possible interpretations of the corresponding state metrics as
discussed in Section III. The earlier discussion of the trace distance, for
example, carries over to give a meaning for $D_{\rm ave}$. Suppose we are asked
to distinguish between $\mathcal{E}(\psi)$ and $\mathcal{F}(\psi)$ for some
$\psi$ which is known, but has been chosen uniformly at random. On average, the
optimal probability of successfully distinguishing the two processes will be
$1/2 + D_{\rm ave}(\mathcal{E},\mathcal{F})/2$. Thus,
$D_{\rm ave}(\mathcal{E},\mathcal{F})$ may be interpreted as a measure of the
average bias in favour of correctly distinguishing which process was applied to
a state $\psi$. With regard to the fidelity-based metrics, however, there does
not appear to be any clear physical interpretation for $\Delta_{\rm ave}$
because of the lack of any clear meaning for the fidelity-based metrics.

Finally, completing the checklist of criteria, our numerical analysis shows that
$\Delta_{\rm ave}$ is not stable for any of the four candidate state metrics
we've investigated. Later in the paper we describe in detail a method for
"stabilizing" measures which are not stable; we now briefly note the results
that are obtained when this procedure is applied in the present context. The
idea is to introduce an ancillary system $A$, and consider the quantity
$\Delta_{\rm stab-ave}(\mathcal{E},\mathcal{F}) = \lim \Delta_{\rm ave}(\mathcal{I}\otimes\mathcal{E},\mathcal{I}\otimes\mathcal{F})$,
where the limit is that of large ancilla dimension. Using the well-known result
that a randomly chosen state of a composite system $AQ$ ($\dim A \gg \dim Q$)
has very close to maximal entanglement [46], [47], it follows that
$\Delta_{\rm stab-ave}(\mathcal{E},\mathcal{F}) = \Delta_{\rm pro}(\mathcal{E},\mathcal{F})$,
i.e., the stabilized average distance reduces to the process distance considered
earlier.

There is an alternative approach, available because the fidelity-based metrics
are nonlinear functions of the fidelity, which is to create a measure based on
the average fidelity:

$$F_{\rm ave}(\mathcal{E},\mathcal{F}) = \int d\psi\, F\left(\mathcal{E}(\psi),\mathcal{F}(\psi)\right). \tag{14}$$



When $\mathcal{F}$ is a unitary operation, $U$, the average fidelity has a
physical interpretation that is at least plausible, as the average overlap
between $U|\psi\rangle$ and $\mathcal{E}(\psi)$. It was shown in Ref. 0] (see
also Ref. [Tg) that $F_{\rm ave}$ and $F_{\rm pro}$ are related by the equation

$$F_{\rm ave}(\mathcal{E},U) = \frac{F_{\rm pro}(\mathcal{E},U)\,d + 1}{d + 1}, \tag{15}$$

where $d$ is the dimension of the quantum system, and we are restricting
ourselves to the case where $U$ is a unitary operation. This relationship makes
$F_{\rm ave}(\mathcal{E},U)$ easy to calculate [l9|, [10] and also easy to
measure experimentally, using the techniques described in the previous
subsection for $F_{\rm pro}(\mathcal{E},U)$.
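
The relation in Eq. (15) is easy to check numerically. The following is a sketch under stated assumptions: NumPy is used, the fidelity with a pure target is taken as $\langle\psi|U^\dagger\mathcal{E}(\psi)U|\psi\rangle$, and the test process (a Hadamard gate followed by depolarisation with probability $p = 0.1$) is a hypothetical example whose process fidelity is $1 - p + p/d^2 = 0.925$.

```python
import numpy as np


def average_fidelity_from_process_fidelity(f_pro, d):
    """Eq. (15): F_ave = (d * F_pro + 1) / (d + 1)."""
    return (d * f_pro + 1) / (d + 1)


def average_fidelity_monte_carlo(channel, U, samples=5000, seed=1):
    """Mean of <psi| U^dagger E(psi) U |psi> over Haar-random pure states psi."""
    d = U.shape[0]
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(samples):
        v = rng.normal(size=d) + 1j * rng.normal(size=d)
        v /= np.linalg.norm(v)
        target = U @ v  # ideal output U|psi>
        total += np.real(target.conj() @ channel(np.outer(v, v.conj())) @ target)
    return total / samples


H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
p = 0.1


def noisy_hadamard(rho):
    """Hypothetical test process: Hadamard, then depolarise with probability p."""
    return (1 - p) * (H @ rho @ H.conj().T) + p * np.trace(rho) * np.eye(2) / 2


f_pro = 1 - p + p / 4  # process fidelity of this test process for d = 2
print(average_fidelity_from_process_fidelity(f_pro, 2),
      average_fidelity_monte_carlo(noisy_hadamard, H))
```

Both numbers should come out at $0.95$. For this depolarising example the overlap is the same for every input state, so the sampled average carries no statistical scatter; for a generic process the Monte Carlo value only agrees with Eq. (15) up to sampling error.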

Although $F_{\rm ave}$ has several advantages (ease of calculation, ease of
measurement, and a physical interpretation), the outlook for the other criteria
is not so good. Not only is $F_{\rm ave}$ not a metric, it is not stable either,
a fact that follows from Eq. (15) and the knowledge that $F_{\rm pro}$ is stable.
The same argument shows that measures analogous to $A$, $B$, and $C$ based on
$F_{\rm ave}$ will also not be stable. We do not know of any stable metrics that
may be derived as a function of $F_{\rm ave}$, and Eq. (15) renders any such
metrics equivalent in content to functions based on $F_{\rm pro}$, so the only
reason to use them would be if they had better characteristics.

To summarize, the results of this section show that none of the average-case
error measures we have defined are particularly attractive. However, these
negative results are vital because these approaches are all fairly natural
solutions one might take to defining a plausible error measure. It was therefore
important to consider them carefully before choosing to reject them.



C. Error measures based on the worst case

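Our final approach to defining error measures is based on the worst case
distance between $\mathcal{E}(\psi)$ and $\mathcal{F}(\psi)$. We define

$$\Delta_{\max}(\mathcal{E},\mathcal{F}) = \max_\psi \Delta\left(\mathcal{E}(\psi),\mathcal{F}(\psi)\right), \tag{16}$$

where the maximum is over all possible pure state inputs, $\psi$, and $\Delta$
is a metric on quantum states.

When $\Delta = \Delta^F$ is a fidelity-based metric, we see $\Delta^F_{\max}$ is
a function of the minimal fidelity, defined by

$$F_{\min}(\mathcal{E},\mathcal{F}) = \min_\psi F\left(\mathcal{E}(\psi),\mathcal{F}(\psi)\right). \tag{17}$$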

In the definition of $\Delta_{\max}$, we maximize over all pure state inputs. Is
this maximum the same if all physical inputs, including mixed states, are
considered? In fact, it is fairly simple to show that this is true, and
therefore that it does not matter if we optimize over pure or mixed states
[Ijl . Suppose $\Delta$ is a doubly convex metric, as are all the metrics
discussed in this paper (c.f. Sec. III). If the maximum is achieved at some
mixed state, $\rho$, then we have
$\Delta_{\max} = \Delta(\mathcal{E}(\rho),\mathcal{F}(\rho))$. Expanding
$\rho = \sum_j p_j \psi_j$ as a mixture of pure states, and applying double
convexity we see that the maximum must also be attained at some pure state
$\psi_j$. A similar argument holds for $F_{\min}$, based on the double concavity
of the fidelity.
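
The convexity step can be written out explicitly. In the sketch below, $\Delta^{\rm all}_{\max}$ is a label introduced here (it does not appear in the original text) for the maximum over all input states, pure or mixed, assumed to be attained at $\rho = \sum_j p_j\psi_j$; $\Delta_{\max}$ is the pure-state maximum of Eq. (16).

```latex
\begin{align*}
\Delta^{\rm all}_{\max}
  &= \Delta\bigl(\mathcal{E}(\rho),\mathcal{F}(\rho)\bigr)
   = \Delta\Bigl(\sum_j p_j\,\mathcal{E}(\psi_j),\ \sum_j p_j\,\mathcal{F}(\psi_j)\Bigr)
     && \text{(linearity of } \mathcal{E},\mathcal{F}\text{)} \\
  &\le \sum_j p_j\,\Delta\bigl(\mathcal{E}(\psi_j),\mathcal{F}(\psi_j)\bigr)
     && \text{(double convexity)} \\
  &\le \max_j \Delta\bigl(\mathcal{E}(\psi_j),\mathcal{F}(\psi_j)\bigr)
   \le \Delta_{\max}
   \le \Delta^{\rm all}_{\max}.
\end{align*}
```

Since the chain begins and ends with the same quantity, every inequality is an equality, so the pure-state maximum coincides with the maximum over all states.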

To assess the suitability of these measures, it is useful to first note that
$D_{\max}$ has already been shown in general not to be stable [lfj , and similar
arguments can be made to extend this to the fidelity-based measures. In
Ref. [l(|, Aharonov et al. resolve this difficulty by constructing a variant of
$D_{\max}$ which is stable, but which otherwise has extremely similar properties
to $D_{\max}$. We now describe how this procedure can be extended to define a
stable version of $\Delta_{\max}$ for an arbitrary state metric $\Delta$, and
defer for the moment discussion of the other criteria.

Suppose the original system $Q$ on which $\mathcal{E}$ and $\mathcal{F}$ act has
state space dimension $d$. It will be convenient to use subscripts to indicate
the system on which operations act (e.g. $\mathcal{E} = \mathcal{E}_Q$,
$\mathcal{F} = \mathcal{F}_Q$). We introduce a fictitious $d$-dimensional
ancillary system $A$, acted on by the identity operation $\mathcal{I}_A$, and
define the stabilized quantity [50]

$$\Delta_{\rm stab}(\mathcal{E}_Q,\mathcal{F}_Q) = \Delta_{\max}(\mathcal{I}_A\otimes\mathcal{E}_Q,\ \mathcal{I}_A\otimes\mathcal{F}_Q). \tag{18}$$

The proof that $\Delta_{\rm stab}$ is stable under addition of systems is simple
and has been included in Appendix A 1. In the same way, we can also define a
stable form of the minimum fidelity,
$F_{\rm stab}(\mathcal{E}_Q,\mathcal{F}_Q) = F_{\min}(\mathcal{I}_A\otimes\mathcal{E}_Q,\ \mathcal{I}_A\otimes\mathcal{F}_Q)$,
with the proof of stability following similar lines. Note that the stabilized
fidelity-based metrics $\Delta^F_{\rm stab}$ are functions of $F_{\rm stab}$ in
the obvious way (e.g. we define as usual $A_{\rm stab}$, $B_{\rm stab}$ and
$C_{\rm stab}$).

Which of the other criteria for an error measure does $\Delta_{\rm stab}$
satisfy? It is straightforward to show that $\Delta_{\rm stab}$ satisfies the
metric and chaining criteria. Furthermore, the stabilized trace-distance
$D_{\rm stab}$ has an appealing physical interpretation: it is the worst-case
bias in the probability of being able to distinguish
$(\mathcal{I}\otimes\mathcal{E})(\psi)$ from
$(\mathcal{I}\otimes\mathcal{F})(\psi)$, where we allow an ancilla of arbitrary
size. We defer discussion of the physical interpretation of the fidelity-based
measures until the next section, where we will see that both they and
$D_{\rm stab}$ can be given an elegant interpretation in the context of quantum
computation.

What of the remaining criteria, ease of calculation and ease of measurement?
Unfortunately, no powerful general formulae for calculating $\Delta_{\rm stab}$
are known. Reference fioj ] gives a general formula for the distance
$D_{\rm stab}$ between two unitary operations, but the more interesting case of
the distance between an idealized unitary operation and a noisy quantum process
has not been solved, even for single-qubit operations.

The good news is that $D_{\rm stab}$ and $F_{\rm stab}$ (and thus
$A_{\rm stab}$, $B_{\rm stab}$ and $C_{\rm stab}$) are easy to calculate
numerically, because they can all be reduced to convex optimization problems
[51]. For this special class of problem, where the task is to minimize a convex
function defined on a convex set, extremely efficient numerical techniques are
available. Among many other nice properties, it is possible to show that a local
minimum of a convex optimization problem is always a global minimum, and thus
techniques such as gradient descent typically converge extremely rapidly, with
no danger of finding false minima. In Appendix A 2 we prove explicitly that
finding $F_{\rm stab}$ belongs to this class of problems, and the proof for
$D_{\rm stab}$ follows similar lines.
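
The convex program itself is not reproduced here. As a much cruder stand-in, the sketch below searches at random over pure states of the ancilla plus system and keeps the largest trace distance found, which gives only a lower bound on $D_{\rm stab}$ as defined in Eq. (18). It assumes NumPy and that the two processes are supplied as lists of Kraus operators; the function names are illustrative, and this random search is not the optimization method referred to in the text.

```python
import numpy as np


def trace_distance(rho, sigma):
    """Half the sum of |eigenvalues| of the Hermitian difference."""
    return 0.5 * np.abs(np.linalg.eigvalsh(rho - sigma)).sum()


def extend_with_ancilla(kraus_ops, d_anc):
    """Kraus operators of I_A (x) E_Q, with the ancilla as the first factor."""
    return [np.kron(np.eye(d_anc), K) for K in kraus_ops]


def apply_kraus(kraus_ops, rho):
    return sum(K @ rho @ K.conj().T for K in kraus_ops)


def d_stab_lower_bound(kraus_e, kraus_f, samples=2000, seed=0):
    """Random-search lower bound on D_stab using a d-dimensional ancilla."""
    d = kraus_e[0].shape[0]
    ext_e = extend_with_ancilla(kraus_e, d)
    ext_f = extend_with_ancilla(kraus_f, d)
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(samples):
        v = rng.normal(size=d * d) + 1j * rng.normal(size=d * d)
        v /= np.linalg.norm(v)
        psi = np.outer(v, v.conj())
        dist = trace_distance(apply_kraus(ext_e, psi), apply_kraus(ext_f, psi))
        best = max(best, float(dist))
    return best


identity_kraus = [np.eye(2, dtype=complex)]
z_kraus = [np.array([[1, 0], [0, -1]], dtype=complex)]
print(d_stab_lower_bound(identity_kraus, z_kraus))
```

For the identity and the Pauli $Z$ gate the two outputs can be perfectly distinguished (for instance on the input $(|0\rangle + |1\rangle)/\sqrt{2}$), so the true value is 1 and the random search should approach it from below. A proper convex solver would replace the sampling loop and return the exact optimum.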

We have seen that numerical calculation of $D_{\rm stab}$ and $F_{\rm stab}$ can
easily be carried out, and this enables a two-step procedure for experimental
measurement of either quantity: process tomography, followed by a numerical
optimization. Of course, finding general formulae along the lines of
$F_{\rm pro}(\mathcal{E},U)$ or $D_{\rm pro}$ is still a highly desirable goal.
Aside from the intrinsic benefit, finding general formulae would simplify the
experimental measurement and determination of error bars for $D_{\rm stab}$ and
$F_{\rm stab}$, and perhaps obviate the need for a full process tomography, as
Eq. (10) did for $F_{\rm pro}(\mathcal{E},U)$.



V. APPLICATION TO QUANTUM 
COMPUTING 

Can we find a good physical interpretation for any of the error measures that
we've identified? In this section we will focus on interpretations that arise
within the context of quantum computation and we will find that of the error
measures we have discussed, four have particularly outstanding properties:
$D_{\rm pro}$, $F_{\rm pro}$, $D_{\rm stab}$ and $F_{\rm stab}$. (Note that in
the case of the fidelity, it will actually be more convenient to state our
results in terms of the equivalent measures $C_{\rm pro}$ and $C_{\rm stab}$.)

Assessed according to the criteria described in the introduction, these four
measures have already been found to be superior to all the other measures we
have studied. The additional fact that each arises naturally in the context of
quantum computation strongly indicates that these four measures are the most
deserving of consideration as measures of error in quantum information
processing. We will return in the conclusion, Sec. VI, to the question of which
of these four measures is the best possible measure of error.

There are a variety of different ways of describing quantum computations, and it
turns out that each of the four error measures arises naturally in different
contexts. We will discuss separately two broad divisions of quantum computation,
function computation and sampling computation, looking at both worst-case and
average-case performance for each division.

Most algorithms on classical computers are framed as 
function computations. We will see that our error mea- 
sures can be given particularly compelling interpretations 
relating to the probability of error in a function compu- 
tation. However, in the context of simulating quantum 
systems it is often more natural to consider sampling 
computations, where the goal is to reproduce the statis- 
tics obtained from a measurement of the system in some 
specified configuration. Again, we will see that our error 
measures can be given good interpretations in this con- 
text, albeit somewhat more complex interpretations than 
for function computation. 

The reason for treating the two types of computation 
separately is at least partially a practical one, since both 
types of computation arise naturally in the context of 
quantum computation. However, a more fundamental 
reason is that it does not appear to be known how to 
reduce sampling computation to function computation. 
Rather remarkably, even when there is an efficient way 
of computing a probability distribution, there does not 
appear to be any general way to convert that into an 
efficient way of sampling from that distribution. 



A. Function computation

In function computation, the goal of the quantum computation is to compute a
function, $f$, exactly or with high probability of success. More precisely, the
goal is to take as input an instance, $x$, of the problem, and to produce a
final state $\rho_x$ of the computer that is either equal to $|f(x)\rangle$, or
sufficiently close that when a measurement in the computational basis is
performed, the outcome is $f(x)$ with high probability. Grover's algorithm is
usually cast in this way, where we want to determine the identity of the state
marked by the oracle.

Function computation in the worst case: Suppose we attempt to perform a quantum
computation represented by an ideal operation $\mathcal{F}$ that acts on an
input $|x\rangle$, where $x$ represents the instance of the problem to be
solved, e.g., a number to be factored [52]. This process succeeds in computing
$f(x)$ with an error probability of at most $p_e^{\rm id}$, where 'id' indicates
that this is the ideal worst-case error probability. Of course, in reality some
non-ideal operation $\mathcal{E}$ is performed. A good measure of error in the
real computation is the actual probability $p_e$ that the measured output of the
computation is not equal to $f(x)$. In Appendix B 1 we show that

$$p_e \le p_e^{\rm id} + D_{\rm stab}(\mathcal{E},\mathcal{F}) \tag{19}$$

$$p_e \le \left(\sqrt{p_e^{\rm id}} + C_{\rm stab}(\mathcal{E},\mathcal{F})\right)^2. \tag{20}$$

Which of these inequalities is better depends upon the exact circumstances. For
example, when $p_e^{\rm id} = 0$, we see that which inequality is better depends
upon whether $D_{\rm stab}(\mathcal{E},\mathcal{F})$ is larger or smaller than
$C_{\rm stab}(\mathcal{E},\mathcal{F})^2$. With Eq. (5) in mind, it is not
difficult to convince oneself that either of these possibilities may occur.

Function computation in the average case: Once again our goal is to compute a
function $f(x)$ using an approximation $\mathcal{E}$ to some ideal operation
$\mathcal{F}$. However, we now look at the average-case error probability
$\overline{p}_e$ that the measured output of $\mathcal{E}(|x\rangle\langle x|)$
is not equal to $f(x)$, where the average is taken with respect to a uniform
distribution over instances $x$. Correspondingly, we introduce
$\overline{p}_e^{\rm id}$, the average case error probability for the idealized
operation $\mathcal{F}$. We show that (App. B):

$$\overline{p}_e \le \overline{p}_e^{\rm id} + D_{\rm pro}(\mathcal{E},\mathcal{F}). \tag{21}$$

Unfortunately, we have been unable to develop a full natural analogue of
Eq. (21) based on the fidelity. However, we have proved a partial analogue for
when the ideal computation succeeds with probability one
($\overline{p}_e^{\rm id} = 0$). In this case:

$$\overline{p}_e \le C_{\rm pro}(\mathcal{E},\mathcal{F})^2 = 1 - F_{\rm pro}(\mathcal{E},\mathcal{F}). \tag{22}$$

The proof uses very similar techniques to those used to establish Eqs. (21) and
(20), and is therefore omitted.



B. Sampling computation 

In sampling quantum computation, the goal is to sample from some ideal
distribution $\{p_x(y)\} = p_x$ on measurement outcomes $y$, with $x$
representing input data for the problem. For instance, $x$ might represent the
coupling strengths and temperature of some spin glass model, with the goal being
to sample from the thermal distribution of configurations $y$ for that spin
glass. This type of computation is particularly useful for simulating the
dynamics of another quantum system.

Unlike Grover's algorithm, Shor's algorithm is usually described as a sampling
computation. The goal is not to directly produce a factor or list of factors,
but rather to produce a distribution over measurement outcomes. By sampling from
this distribution and doing classical post-processing it is possible to extract
factors of some number $x$. Of course, as noted in Ref. [53], it is possible to
modify Shor's algorithm to be a function computation, taking an instance $x$ and
producing a list of all the factors of $x$.

The desired result in sampling computation is that the measurement outcomes $y$
are distributed according to the ideal probabilities $p_x(y)$, for a given
problem instance $x$. Suppose, however, that they are instead distributed
according to some nonideal set of real probabilities $q_x(y)$. How should we
compare these two distributions? There are two widely-used classical measures
enabling comparison of probability distributions $p$ and $q$. The first is the
Kolmogorov or $l_1$ distance, defined by
$D(p,q) \equiv \sum_y |p(y) - q(y)|/2$. The second is the Bhattacharya overlap,
defined by $F(p,q) \equiv \left(\sum_y \sqrt{p(y)q(y)}\right)^2$. Since these
measures are in fact commutative analogues of the trace distance and fidelity,
respectively, we represent them with the same symbols as their quantum analogues
($D$ and $F$). As with the trace distance, the Kolmogorov distance can be given
an appealing interpretation as the bias in probability when trying to
distinguish the distributions $p$ and $q$. No similarly simple interpretation
for the Bhattacharya overlap seems to be known, although it is related to the
Kolmogorov distance through inequalities analogous to Eq. (5).
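
Both classical measures are straightforward to compute. The sketch below uses only the Python standard library and rests on two assumptions that should be checked against the definitions earlier in the paper: that $F$ denotes the squared overlap (mirroring a squared-fidelity convention, consistent with $C = \sqrt{1-F}$), and that the inequalities referred to take the standard form $1 - \sqrt{F} \le D \le \sqrt{1-F}$. The function names and the two example distributions are illustrative.

```python
import math


def kolmogorov_distance(p, q):
    """D(p, q) = (1/2) * sum_y |p(y) - q(y)|."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def bhattacharya_overlap(p, q):
    """F(p, q) = (sum_y sqrt(p(y) q(y)))**2; the square is an assumed convention."""
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q)) ** 2


p = [0.5, 0.5]  # hypothetical ideal distribution
q = [0.9, 0.1]  # hypothetical real distribution
D = kolmogorov_distance(p, q)
F = bhattacharya_overlap(p, q)
lower, upper = 1 - math.sqrt(F), math.sqrt(1 - F)
print(D, F, lower <= D <= upper)
```

For these two distributions $D = 0.4$ and $F = 0.8$, and $D$ lies between $1 - \sqrt{F} \approx 0.106$ and $\sqrt{1-F} \approx 0.447$.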

The Kolmogorov distance and Bhattacharya overlap, 
together with the quantum error measures we have in- 
troduced, can be used to relate ideal and real probability 
distributions obtained as the result of a quantum com- 
putation. 

Sampling computation in the worst case: Suppose we attempt to perform a quantum
computation represented by an ideal operation $\mathcal{F}$ that acts on an
input $|x\rangle$, where $x$ represents the instance of the problem to be
solved. The goal is to produce a final state
$\mathcal{F}(|x\rangle\langle x|)$ which, when measured in the computational
basis, gives rise to an ideal distribution $p_x$. Instead, we perform the
operation $\mathcal{E}$, giving rise to a distribution $q_x$ on measurement
outcomes. In Appendix B 3 we prove that:

$$\max_x D(q_x,p_x) \le D_{\rm stab}(\mathcal{E},\mathcal{F}) \tag{23}$$

$$\max_x \left[1 - F(q_x,p_x)\right] \le C_{\rm stab}(\mathcal{E},\mathcal{F})^2. \tag{24}$$

Just as for function computation, which of these is the better inequality
depends upon the details of the situation under study.

Sampling computation in the average case: Given the same situation as for the
worst case, we now assume that problem instances are chosen uniformly at random.
We will therefore use the Kolmogorov distance and Bhattacharya overlap between
the joint distributions $\{p(x,y)\} = p$ and $\{q(x,y)\} = q$ to measure how
well $\mathcal{E}$ has approximated $\mathcal{F}$. Arguments analogous to those
used in the worst case establish:

$$D(q,p) \le D_{\rm pro}(\mathcal{E},\mathcal{F}) \tag{25}$$

$$1 - F(q,p) \le C_{\rm pro}(\mathcal{E},\mathcal{F})^2. \tag{26}$$



VI. SUMMARY, RECOMMENDATIONS, AND 
CONCLUSION 

We have formulated a list of criteria that must be satis- 
fied by a good measure of error in quantum information 
processing. These criteria provide a broad framework 
that can be used to assess candidate error measures, in- 
corporating both theoretical and experimental desider- 
ata. 

We have used this framework to comprehensively sur- 
vey possible approaches to the definition of an error mea- 
sure, rejecting many a priori plausible error measures as 
they fail to satisfy many of our criteria. Although many 
of these rejected error measures are of some interest as di- 
agnostic measures, none are suitable for use as a primary 
measure of the error in a quantum information processing 
task. 

Four error measures were identified which have particular merit, each of which
satisfies most or all of the criteria we identified. These measures are the
J distance (Jamiolkowski process distance), the J fidelity (Jamiolkowski process
fidelity), the S distance (stabilized process distance) and the S fidelity
(stabilized process fidelity), denoted $D_{\rm pro}$, $F_{\rm pro}$,
$D_{\rm stab}$ and $F_{\rm stab}$, respectively.

All four measures either are metrics (in the case of the process distances) or
give rise to a variety of associated metrics (for the process fidelities).
Moreover, all of the metrics can be shown to satisfy stability and chaining
properties which greatly simplify the analysis of multistage quantum information
processing tasks, as described in the introduction. The main differences arise
in the criteria of easy calculation, measurement and sensible physical
interpretation. We now briefly summarize these remaining properties for the four
measures. Throughout this section, we assume that the goal in each case is to
compare a quantum operation $\mathcal{E}$ to an ideal unitary operation $U$; the
results vary somewhat when $\mathcal{E}$ is being compared to an arbitrary
process $\mathcal{F}$.

(i) J distance: There is a straightforward formula enabling $D_{\rm pro}$ to be
calculated directly from the process matrix, thus also allowing it to be
experimentally determined using quantum process tomography. The J distance can
be given an operational interpretation as a bound on the average probability of
error $\overline{p}_e$ experienced during quantum computation of a function, or
as a bound on the distance between the real and ideal joint distributions of the
computer in a sampling computation:

$$\overline{p}_e \le \overline{p}_e^{\rm id} + D_{\rm pro}(\mathcal{E},U) \tag{27}$$

$$D(q,p) \le D_{\rm pro}(\mathcal{E},U). \tag{28}$$

In the first expression $\overline{p}_e^{\rm id}$ is the average probability of
error in the ideal computation, represented by $U$. In the second expression,
$D(q,p)$ is the Kolmogorov distance between the real joint probability
distribution $\{q(x,y)\} = q$ on problem instances $x$ and measurement outcomes
$y$ and the ideal joint distribution $\{p(x,y)\} = p$, for a uniform
distribution on problem instances.

(ii) J fidelity: Once again, the J fidelity can be calculated directly from the
process matrix. However, there is also a simpler formula for $F_{\rm pro}$,
Eq. (11), allowing easy calculation and measurement, without the need for full
process tomography. This is much more straightforward than the calculation for
the J distance, and is likely to simplify the determination of experimental
errors. As for the J distance, the J fidelity can be given an operational
interpretation related to average error probabilities:

$$\overline{p}_e \le 1 - F_{\rm pro}(\mathcal{E},U) \tag{29}$$

$$F(q,p) \ge F_{\rm pro}(\mathcal{E},U). \tag{30}$$

In the first expression we are now restricted to ideal computations $U$ which
succeed perfectly, i.e., $\overline{p}_e^{\rm id} = 0$. In the second
expression, $F(q,p)$ is the Bhattacharya overlap between the real and ideal
joint probability distributions, $q$ and $p$, again for a uniform distribution
on problem instances.

(iii) S distance: There is no known elementary formula for $D_{\rm stab}$, but
we have proved that calculating the S distance is equivalent to a convex
optimization problem, which can be efficiently solved numerically, given
knowledge of the process. This, in turn, enables $D_{\rm stab}$ to be measured
experimentally, by performing full quantum process tomography. The S distance
can be simply interpreted as a bound on the worst-case error probability $p_e$
for a function computation, and as a bound on the maximum distance between the
real and ideal output distributions of a sampling computation:

$$p_e \le p_e^{\rm id} + D_{\rm stab}(\mathcal{E},U) \tag{31}$$

$$\max_x D(q_x,p_x) \le D_{\rm stab}(\mathcal{E},U). \tag{32}$$

In the first expression $p_e^{\rm id}$ is the worst-case error probability in
the ideal computation, $U$. In the second expression $D(q_x,p_x)$ is the
Kolmogorov distance between the real and ideal output probability distributions
$\{q_x(y)\} = q_x$ and $p_x$, and we take the worst case over all problem
instances $x$.

(iv) S fidelity: Once again, no elementary formula for the S fidelity is known,
but we have proved that the determination of $F_{\rm stab}$ can be formulated as
a convex optimization problem, and thus $F_{\rm stab}$ can be efficiently
determined numerically. As a result, $F_{\rm stab}$ can again be determined
experimentally, using process tomography. As with the S distance,
$F_{\rm stab}$ has an operational interpretation related to worst-case error
probabilities:

$$p_e \le \left(\sqrt{p_e^{\rm id}} + C_{\rm stab}(\mathcal{E},U)\right)^2 \tag{33}$$

$$\min_x F(q_x,p_x) \ge F_{\rm stab}(\mathcal{E},U). \tag{34}$$

The notation here is the same as above, with the definition
$C_{\rm stab}(\mathcal{E},U) \equiv \sqrt{1 - F_{\rm stab}(\mathcal{E},U)}$.

Which of these four error measures is the best? Our recommendation is
necessarily tentative, for we do not yet have a complete understanding of the
properties of these measures. In particular, the discovery of simpler formulae
for calculating the measures or simpler procedures for measuring them
experimentally remain possibilities which could make it necessary to reconsider
their relative merits.

The fact that all four measures obey the stability and chaining criteria means
that in all cases it is only necessary to characterize the component processes
in order to bound the total error in a complex quantum information processing
task. This makes conceivable the idea of using these measures for assessing
processes in large-scale systems.

One important difference between the measures is that 
the S distance and S fidelity bound worst-case error 
probabilities, as compared to the average-case error prob- 
abilities for which the J distance and J fidelity provide 
bounds. This would seem to be a significant advantage 
for the S distance and S fidelity, since worst-case errors 
are usually of more interest than the average case. On 
the other hand, given the linear nature of quantum me- 
chanics, it seems likely that in low dimensions relatively 
tight ways may be found to use the average errors to 
bound the worst-case errors. 

The measure which is simplest to calculate is the J fidelity, which has a simple
formula, and is relatively easy to determine experimentally compared with the
other measures. Unfortunately, this measure has the weakest operational
interpretation of the four. As well as being only related to the average-case
probability of error, our expression Eq. (29) does not hold true for function
computations where the ideal case suffers an intrinsic error. For this reason we
believe that the J fidelity is of particular interest for early,
proof-of-principle experimental demonstrations, but that other measures with
more desirable properties will eventually supersede it.

The J distance has different strengths and weaknesses 
than the J fidelity. On the one hand, it does allow the 
analysis of function computations with intrinsic errors in 
the ideal case. However, it requires a full process tomog- 
raphy to be determined experimentally, it is not as easy 
to calculate, and is still only related to average errors. 

The S distance and S fidelity have the most attractive 
operational interpretations, since they relate to worst- 
case error probabilities. Unfortunately, they are also 
more difficult to determine experimentally than the J fi- 
delity, requiring full process tomography, and no elemen- 
tary formula for either is known. However, they are easy 
to calculate numerically, and although full process to- 
mography is a time-consuming task, it is becoming a 
standard technique in quantum information experiments. 

On the basis of their compelling operational interpre- 
tations, and other attractive theoretical and experimen- 
tal properties, we believe that the S distance and S fi- 
delity are the two best error measures, and should be used 
as the basis for comparison of real quantum information 
processing experiments to the theoretical ideal. 

Is it possible to make a definite recommendation as regards which of these two
measures to use? At the moment, we know of no convincing argument to choose one
over the other. For instance, it is straightforward to find examples of
different processes where either the S distance or the S fidelity give the
better bound in Eqs. (31) and (33). Further work on the relative merits of these
measures is required before a definitive choice can be made.
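
To make the comparison concrete, the sketch below evaluates the right-hand sides of Eqs. (31) and (33) and reports which is smaller. It uses only the standard library; the function names are illustrative, and the numerical inputs are hypothetical values chosen for the example, not measured or derived quantities.

```python
import math


def distance_bound(p_id, d_stab):
    """Right-hand side of Eq. (31): p_id + D_stab."""
    return p_id + d_stab


def fidelity_bound(p_id, f_stab):
    """Right-hand side of Eq. (33): (sqrt(p_id) + C_stab)**2, C_stab = sqrt(1 - F_stab)."""
    c_stab = math.sqrt(1.0 - f_stab)
    return (math.sqrt(p_id) + c_stab) ** 2


def tighter_bound(p_id, d_stab, f_stab):
    """Return the name and value of whichever bound on p_e is smaller."""
    b_d = distance_bound(p_id, d_stab)
    b_f = fidelity_bound(p_id, f_stab)
    return ("S distance", b_d) if b_d <= b_f else ("S fidelity", b_f)


# Hypothetical values: same D_stab = 0.10, two different F_stab.
print(tighter_bound(0.0, 0.10, 0.96))  # here 1 - F_stab = 0.04 < D_stab
print(tighter_bound(0.0, 0.10, 0.84))  # here 1 - F_stab = 0.16 > D_stab
```

With a perfect ideal computation ($p_e^{\rm id} = 0$) the fidelity bound reduces to $1 - F_{\rm stab}$, so the first example favours the S fidelity and the second favours the S distance.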

As a consequence, at the present time we believe that 
both measures should be reported in experiments. Note 
that determining two measures rather than one imposes 
little additional burden on experimentalists, since deter- 
mining either measure requires (at present) process to- 
mography to be performed, and once process tomography 
has been performed it is straightforward to numerically 
calculate both measures. 

Much work remains to be done. Tasks of obvious 
importance include: (a) obtaining closed-form formulae 
and simple experimental measurement procedures for the 
S distance and S fidelity; (b) finding procedures which 
can be used to calculate experimental error bars for the 
S distance and S fidelity; (c) expressing the threshold 
condition for fault-tolerant quantum computation and 
communication using the error measures we have iden- 
tified; and (d) extending our work so that it applies to 
quantum operations which are not trace-preserving, such 
as arise naturally in certain optical proposals for quantum
computation [54, 55], where measurements and postselection
are critical elements.

Broadening the scope, it would also be useful to de- 
velop additional diagnostic measures, which could be 
used experimentally to understand and improve specific 
aspects of a process's operation, while not being suit- 
able as general-purpose measures of how well a process 
has been performed. An example of such a measure is 
the process purity, $\mathrm{tr}(\rho_{\mathcal{E}}^2)$, which can be regarded as a
measure of the extent to which a quantum operation $\mathcal{E}$
maintains the purity of the quantum state. Although
this measure is easily seen to be deficient in terms of the
criteria developed in the introduction, and thus is not
suitable as a general-purpose measure, it may be useful
as a diagnostic measure that provides information about
one specific aspect of $\mathcal{E}$'s performance.
APPENDIX A: WORST CASE PROOFS

1. Proof of worst-case stabilization 

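Let $\mathcal{E}_Q$ and $\mathcal{F}_Q$ be trace-preserving quantum operations
acting on a $d$-dimensional system $Q$. We will show,
following Ref. [l(|, that $\Delta_{\rm stab}(\mathcal{E}_Q, \mathcal{F}_Q)$ is stable under
the addition of an arbitrary $d'$-dimensional system $Q'$, i.e.,

$$\Delta_{\rm stab}(\mathcal{E}_Q, \mathcal{F}_Q) = \Delta_{\rm stab}(\mathcal{I}_{Q'} \otimes \mathcal{E}_Q, \mathcal{I}_{Q'} \otimes \mathcal{F}_Q).$$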

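To see this, recall the definition of $\Delta_{\rm stab}(\mathcal{E}_Q, \mathcal{F}_Q)$. We
introduce a fictitious $d$-dimensional ancillary system $A$,
acted upon by the identity operation $\mathcal{I}_A$. Then by definition
$\Delta_{\rm stab}(\mathcal{E}_Q, \mathcal{F}_Q) = \Delta_{\max}(\mathcal{I}_A \otimes \mathcal{E}_Q, \mathcal{I}_A \otimes \mathcal{F}_Q)$.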

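By definition of $\Delta_{\rm stab}$ we see that
$\Delta_{\rm stab}(\mathcal{I}_{Q'} \otimes \mathcal{E}_Q, \mathcal{I}_{Q'} \otimes \mathcal{F}_Q)$ is equal to
$\Delta_{\max}(\mathcal{I}_B \otimes \mathcal{I}_{Q'} \otimes \mathcal{E}_Q, \mathcal{I}_B \otimes \mathcal{I}_{Q'} \otimes \mathcal{F}_Q)$, where
$\mathcal{I}_B$ acts as the identity on a $d \times d'$-dimensional ancilla
$B$. Thus, to prove stability it suffices to show that the
quantity $\Delta_{\max}(\mathcal{I}_S \otimes \mathcal{E}_Q, \mathcal{I}_S \otimes \mathcal{F}_Q)$ is independent of the
dimension of the system $S$ that $\mathcal{I}_S$ acts on, provided $S$
is at least $d$-dimensional.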

To see this independence, let $\psi$ be a state achieving the
maximum in $\Delta_{\max}(\mathcal{I}_S \otimes \mathcal{E}_Q, \mathcal{I}_S \otimes \mathcal{F}_Q)$, with a Schmidt
decomposition $\psi = \sum_j \psi_j |e_j\rangle |f_j\rangle$, where $|e_j\rangle$ are orthonormal
states of $S$, and $|f_j\rangle$ is an orthonormal basis set for
$Q$. Since $Q$ is $d$-dimensional, the state $\psi$ has at most $d$
Schmidt coefficients, and so we can restrict our attention
to that $d$-dimensional subspace of $S$ spanned by the states
$|e_j\rangle$ with nonzero Schmidt coefficients. We see that the
maximum can be obtained working only in this subspace,
concluding the proof.
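
The dimension-counting step can be illustrated numerically. The following is a minimal sketch, assuming NumPy; the helper name `schmidt_coefficients` and the chosen dimensions are illustrative and not part of the original argument. It shows that a random pure state on $S \otimes Q$, with $S$ much larger than the $d$-dimensional system $Q$, has no more than $d$ Schmidt coefficients.

```python
import numpy as np

def schmidt_coefficients(psi, dim_s, dim_q):
    """Return the Schmidt coefficients of a pure state on S (x) Q.

    psi is a vector of length dim_s * dim_q with S as the first tensor
    factor. The coefficients are the singular values of the dim_s x dim_q
    coefficient matrix, so there are at most min(dim_s, dim_q) of them.
    """
    matrix = np.asarray(psi).reshape(dim_s, dim_q)
    return np.linalg.svd(matrix, compute_uv=False)

rng = np.random.default_rng(0)
dim_s, dim_q = 7, 2  # ancilla S much larger than the system Q (illustrative)
psi = rng.normal(size=dim_s * dim_q) + 1j * rng.normal(size=dim_s * dim_q)
psi /= np.linalg.norm(psi)

coeffs = schmidt_coefficients(psi, dim_s, dim_q)
# Number of Schmidt coefficients, and the sum of their squares (the norm of psi).
print(len(coeffs), np.sum(coeffs**2))
```

Because the count is bounded by the dimension of $Q$, enlarging $S$ beyond $d$ dimensions cannot give access to any new states in the maximization.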



2. Proof of convex optimization property for $F_{\rm stab}$

Our goal is to show that the problem of computing
$F_{\rm stab}$ can be reduced to the minimization of a convex
function defined on a convex set. To show this we introduce
a new function, denoted $F(\rho_Q, \mathcal{E}_Q, \mathcal{F}_Q)$, where
subscripts indicate the system on which the variable is
defined. The value of $F(\rho_Q, \mathcal{E}_Q, \mathcal{F}_Q)$ is defined to be the
state fidelity $F((\mathcal{I}_A \otimes \mathcal{E}_Q)(\psi), (\mathcal{I}_A \otimes \mathcal{F}_Q)(\psi))$, where $A$ is
an ancilla of at least the same dimension as $Q$, and $\psi$ is
any purification of $\rho_Q$ to $AQ$. It is easily verified that
this definition is independent of which purification $\psi$ of
$\rho_Q$ is used.

From this definition, it can be seen that the problem
of computing $F_{\rm stab}(\mathcal{E}_Q, \mathcal{F}_Q)$ is equivalent to minimizing
$F(\rho_Q, \mathcal{E}_Q, \mathcal{F}_Q)$ over all density matrices $\rho_Q$ of system
$Q$. Therefore, to prove that finding $F_{\rm stab}$ is a convex
optimization problem, we simply need to show that
$F(\rho_Q, \mathcal{E}_Q, \mathcal{F}_Q)$ is a convex function of $\rho_Q$, which takes
values in a convex set.

To do this, let $p_j$ be probabilities, and let $\rho_Q^j$ be
corresponding states of the system $Q$, with purifications $\psi_j$
to a system $AQ$. It is helpful to introduce another ancillary
system $A'$ with an orthonormal basis $|j\rangle$ in one-to-one
correspondence with the index on the states $\rho_Q^j$,
and we define a state $|\psi\rangle = \sum_j \sqrt{p_j}\, |j\rangle |\psi_j\rangle$ of the joint
system $A'AQ$. By observing that $\psi$ is a purification of
$\sum_j p_j \rho_Q^j$, we see that

$$F\Big(\sum_j p_j \rho_Q^j, \mathcal{E}_Q, \mathcal{F}_Q\Big) = F\big((\mathcal{I}_{A'A} \otimes \mathcal{E}_Q)(\psi), (\mathcal{I}_{A'A} \otimes \mathcal{F}_Q)(\psi)\big). \tag{A1}$$

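We then apply the monotonicity of the fidelity (cf.
Sec. III) under decoherence in the $|j\rangle$ basis, giving

$$F\Big(\sum_j p_j \rho_Q^j, \mathcal{E}_Q, \mathcal{F}_Q\Big) \le F\Big[\sum_j p_j |j\rangle\langle j| \otimes (\mathcal{I}_A \otimes \mathcal{E}_Q)(\psi_j), \sum_j p_j |j\rangle\langle j| \otimes (\mathcal{I}_A \otimes \mathcal{F}_Q)(\psi_j)\Big]. \tag{A2}$$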



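Finally, applying some elementary algebra to simplify the
right-hand side, we obtain

$$F\Big(\sum_j p_j \rho_Q^j, \mathcal{E}_Q, \mathcal{F}_Q\Big) \le \sum_j p_j F(\rho_Q^j, \mathcal{E}_Q, \mathcal{F}_Q), \tag{A3}$$

which implies that $F(\rho_Q, \mathcal{E}_Q, \mathcal{F}_Q)$ is convex in $\rho_Q$, as desired.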

A similar construction shows that the computation of
$D_{\rm stab}$ is equivalent to the maximization of a concave function
over a convex set, and thus is also a convex optimization
problem, with concomitant numerical benefits. The
construction is sufficiently similar that we omit the details.
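
The convexity property can be spot-checked numerically. The sketch below is illustrative only and assumes NumPy. The pair of channels (amplitude damping with an arbitrary strength `gamma` as the real operation, the identity as the ideal one), the helper names, and the random sampling are all assumptions made for this example. The code evaluates $F(\rho_Q, \mathcal{E}_Q, \mathcal{F}_Q)$ on a purification of $\rho_Q$, using the squared form of the state fidelity, and compares its value at a mixture of two states with the corresponding mixture of its values.

```python
import numpy as np

def psd_sqrt(m):
    """Square root of a positive semidefinite matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh((m + m.conj().T) / 2)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.conj().T

def state_fidelity(rho, sigma):
    """F(rho, sigma) = (tr sqrt(sqrt(rho) sigma sqrt(rho)))^2."""
    s = psd_sqrt(rho)
    inner = s @ sigma @ s
    vals = np.linalg.eigvalsh((inner + inner.conj().T) / 2)
    return float(np.sum(np.sqrt(np.clip(vals, 0.0, None))) ** 2)

def apply_to_q(kraus_ops, state_aq, d):
    """Apply (identity on A) (x) (channel on Q) to a density matrix on A (x) Q."""
    out = np.zeros_like(state_aq)
    for k in kraus_ops:
        big = np.kron(np.eye(d), k)
        out = out + big @ state_aq @ big.conj().T
    return out

def fidelity_for_input(rho, kraus_e, kraus_f):
    """F(rho_Q, E_Q, F_Q): fidelity of the two outputs on a purification of rho."""
    d = rho.shape[0]
    # |psi> = sum_i |i>_A (x) sqrt(rho)|i>_Q is a purification of rho on Q.
    psi = np.kron(np.eye(d), psd_sqrt(rho)) @ np.eye(d).reshape(d * d)
    proj = np.outer(psi, psi.conj())
    return state_fidelity(apply_to_q(kraus_e, proj, d), apply_to_q(kraus_f, proj, d))

def random_density_matrix(d, rng):
    """Random full-rank density matrix (illustrative sampling, not a specific measure)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

gamma = 0.3  # assumed damping strength, purely illustrative
kraus_e = [np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex),
           np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)]
kraus_f = [np.eye(2, dtype=complex)]

rng = np.random.default_rng(1)
worst_gap = np.inf
for _ in range(200):
    rho1, rho2 = random_density_matrix(2, rng), random_density_matrix(2, rng)
    p = rng.uniform()
    lhs = fidelity_for_input(p * rho1 + (1 - p) * rho2, kraus_e, kraus_f)
    rhs = (p * fidelity_for_input(rho1, kraus_e, kraus_f)
           + (1 - p) * fidelity_for_input(rho2, kraus_e, kraus_f))
    worst_gap = min(worst_gap, rhs - lhs)
# Convexity predicts rhs - lhs >= 0 up to rounding error.
print(worst_gap)
```

A check of this kind does not replace the proof, but it is a convenient sanity test before handing the minimization over $\rho_Q$ to a convex optimization routine.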



APPENDIX B: APPLICATION TO QUANTUM 
COMPUTING 



1. Function computation in the worst case 

Suppose $\mathcal{E}$ and $\mathcal{F}$ are real and ideal quantum operations,
respectively, that act on an input $|x\rangle$, where $x$
represents a problem instance. $\mathcal{E}$ succeeds in computing
the desired function $f(x)$ with an error probability
of at most $p_e$, whereas $\mathcal{F}$ succeeds with an (ideal) error
probability of at most $p_e^{\rm id}$.

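We wish to show:

\begin{align}
p_e &\le D_{\rm stab}(\mathcal{E}, \mathcal{F}) + p_e^{\rm id}, \tag{B1} \\
p_e &\le \left[ C_{\rm stab}(\mathcal{E}, \mathcal{F}) + \sqrt{p_e^{\rm id}} \right]^2. \tag{B2}
\end{align}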



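To prove the first inequality, Eq. (B1), we introduce a
quantum operation $\mathcal{M}$ representing the process of measurement,
$\mathcal{M}(\rho) = \sum_y |y\rangle\langle y| \rho |y\rangle\langle y|$, where the sum is
over all possible measurement outcomes $y$. Now observe
that

\begin{align}
p_e &= D\big((\mathcal{M} \circ \mathcal{E})(|x\rangle\langle x|), |f(x)\rangle\langle f(x)|\big) \tag{B3} \\
&\le D\big((\mathcal{M} \circ \mathcal{E})(|x\rangle\langle x|), (\mathcal{M} \circ \mathcal{F})(|x\rangle\langle x|)\big) + D\big((\mathcal{M} \circ \mathcal{F})(|x\rangle\langle x|), |f(x)\rangle\langle f(x)|\big) \tag{B4} \\
&\le D\big(\mathcal{E}(|x\rangle\langle x|), \mathcal{F}(|x\rangle\langle x|)\big) + p_e^{\rm id}, \tag{B5}
\end{align}

where we used simple algebra in the first line, the triangle
inequality in the second line, and contractivity of the trace
distance and some simple algebra in the third line. The
desired result, Eq. (B1), now follows from the definition
of $D_{\rm stab}$.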
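
A small worked instance of Eqs. (B3)-(B5) may help fix ideas. The sketch below assumes NumPy and a hypothetical single-qubit computation $f(x) = 1 - x$: the ideal operation is a perfect NOT gate, and the real operation is the same gate followed by a bit flip with an assumed probability `flip_prob`. None of these choices come from the text; they only illustrate how the quantities in the chain of inequalities are evaluated.

```python
import numpy as np

def trace_distance(rho, sigma):
    """D(rho, sigma) = (1/2) tr|rho - sigma| for Hermitian inputs."""
    return 0.5 * float(np.sum(np.abs(np.linalg.eigvalsh(rho - sigma))))

def measure(rho):
    """Computational-basis measurement M: keep only the diagonal of rho."""
    return np.diag(np.diag(rho))

X = np.array([[0, 1], [1, 0]], dtype=complex)
flip_prob = 0.1  # assumed noise strength, purely illustrative

def ideal(rho):
    """Ideal operation F: a perfect NOT gate."""
    return X @ rho @ X

def real(rho):
    """Real operation E: the NOT gate followed by a bit flip with probability flip_prob."""
    out = ideal(rho)
    return (1 - flip_prob) * out + flip_prob * (X @ out @ X)

def basis_projector(x):
    """Return |x><x| for x in {0, 1}."""
    ket = np.zeros(2, dtype=complex)
    ket[x] = 1.0
    return np.outer(ket, ket.conj())

for x in (0, 1):
    rho_in = basis_projector(x)
    target = basis_projector(1 - x)                        # |f(x)><f(x)|
    p_e = trace_distance(measure(real(rho_in)), target)    # Eq. (B3)
    p_id = trace_distance(measure(ideal(rho_in)), target)  # ideal error probability
    d_in = trace_distance(real(rho_in), ideal(rho_in))     # first term of Eq. (B5)
    print(x, p_e, d_in + p_id)
```

Since $D_{\rm stab}$ is at least as large as the trace distance between the outputs for any particular input, the bound obtained for each $x$ carries over to Eq. (B1).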

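To prove the second inequality, Eq. (B2), note that

\begin{align}
p_e &= 1 - F\big(\mathcal{E}(|x\rangle\langle x|), |f(x)\rangle\langle f(x)|\big) \tag{B6} \\
&= C\big(\mathcal{E}(|x\rangle\langle x|), |f(x)\rangle\langle f(x)|\big)^2 \tag{B7} \\
&\le \Big[ C\big(\mathcal{E}(|x\rangle\langle x|), \mathcal{F}(|x\rangle\langle x|)\big) + C\big(\mathcal{F}(|x\rangle\langle x|), |f(x)\rangle\langle f(x)|\big) \Big]^2, \tag{B8}
\end{align}

where the first line follows from the definition of $p_e$ and
the state fidelity, the second line follows from the definition
of the metric $C(\cdot, \cdot)$, and the third line follows from
the triangle inequality for $C(\cdot, \cdot)$. The proof of Eq. (B2)
is completed by noting that
$C(\mathcal{E}(|x\rangle\langle x|), \mathcal{F}(|x\rangle\langle x|)) \le C_{\rm stab}(\mathcal{E}, \mathcal{F})$ and
$C(\mathcal{F}(|x\rangle\langle x|), |f(x)\rangle\langle f(x)|) \le \sqrt{p_e^{\rm id}}$.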



2. Function computation in the average case 

As in the worst case, $\mathcal{E}$ and $\mathcal{F}$ are real and ideal quantum
operations that act on an input $|x\rangle$ to compute a
desired function $f(x)$. $\mathcal{E}$ succeeds with an average error
probability $\bar{p}_e$, whereas $\mathcal{F}$ succeeds with an average error
probability $\bar{p}_e^{\rm id}$.

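The first steps in the proof of Eq. (21) are directly
analogous to the proof of Eq. (20), resulting in the inequality

$$\bar{p}_e \le \bar{p}_e^{\rm id} + \frac{1}{d} \sum_x D\big(\mathcal{E}(|x\rangle\langle x|), \mathcal{F}(|x\rangle\langle x|)\big), \tag{B9}$$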



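where $d$ is the total number of possible inputs $x$. Recall
that

$$D_{\rm pro}(\mathcal{E}, \mathcal{F}) = D\big((\mathcal{I} \otimes \mathcal{E})(\Phi), (\mathcal{I} \otimes \mathcal{F})(\Phi)\big), \tag{B10}$$

where $\mathcal{I}$ acts on an ancilla which is a copy of the system
$\mathcal{E}$ and $\mathcal{F}$ act on, and $|\Phi\rangle = \sum_x |x\rangle|x\rangle/\sqrt{d}$ is a maximally
entangled state of the two systems. Now let $\mathcal{M}$ be
a quantum operation representing measurement on the
ancilla system, defined similarly to the definition of $\mathcal{M}$
just above. By contractivity of the trace distance,

$$D_{\rm pro}(\mathcal{E}, \mathcal{F}) \ge D\big((\mathcal{M} \otimes \mathcal{E})(\Phi), (\mathcal{M} \otimes \mathcal{F})(\Phi)\big). \tag{B11}$$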
Elementary algebra gives

$$D\big((\mathcal{M} \otimes \mathcal{E})(\Phi), (\mathcal{M} \otimes \mathcal{F})(\Phi)\big) = \frac{1}{d} \sum_x D\big(\mathcal{E}(|x\rangle\langle x|), \mathcal{F}(|x\rangle\langle x|)\big). \tag{B12}$$

Combining these results, we obtain Eq. (21).
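
The identity in Eq. (B12) and the inequality in Eq. (B11) can be checked numerically for a concrete pair of operations. The following sketch assumes NumPy and, purely for illustration, takes the real operation to be single-qubit amplitude damping with an arbitrary strength `gamma` and the ideal operation to be the identity; the function names are hypothetical.

```python
import numpy as np

def trace_distance(rho, sigma):
    """D(rho, sigma) = (1/2) tr|rho - sigma| for Hermitian inputs."""
    return 0.5 * float(np.sum(np.abs(np.linalg.eigvalsh(rho - sigma))))

def apply_channel(kraus_ops, rho):
    """Apply a channel given by Kraus operators to a density matrix."""
    return sum(k @ rho @ k.conj().T for k in kraus_ops)

def process_state(kraus_ops, d):
    """(I (x) E)(Phi) for |Phi> = sum_x |x>|x> / sqrt(d), ancilla as first factor."""
    phi = np.eye(d).reshape(d * d) / np.sqrt(d)
    rho = np.outer(phi, phi.conj()).astype(complex)
    return apply_channel([np.kron(np.eye(d), k) for k in kraus_ops], rho)

def measure_ancilla(state, d):
    """Apply M (x) I: computational-basis measurement on the ancilla."""
    out = np.zeros_like(state)
    for y in range(d):
        proj = np.zeros((d, d))
        proj[y, y] = 1.0
        big = np.kron(proj, np.eye(d))
        out += big @ state @ big
    return out

d = 2
gamma = 0.3  # assumed damping strength, purely illustrative
kraus_e = [np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex),
           np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)]
kraus_f = [np.eye(d, dtype=complex)]

state_e, state_f = process_state(kraus_e, d), process_state(kraus_f, d)
d_pro = trace_distance(state_e, state_f)                      # Eq. (B10)
lhs = trace_distance(measure_ancilla(state_e, d),
                     measure_ancilla(state_f, d))             # left side of Eq. (B12)

rhs = 0.0
for x in range(d):
    proj = np.zeros((d, d), dtype=complex)
    proj[x, x] = 1.0
    rhs += trace_distance(apply_channel(kraus_e, proj),
                          apply_channel(kraus_f, proj)) / d   # right side of Eq. (B12)

print(d_pro, lhs, rhs)
```

The measured states are block diagonal in the ancilla basis, which is why the trace distance splits into an average over the inputs $x$.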

As already remarked, we have not found a natural
average-case analogue of Eq. (22). However, if $\bar{p}_e^{\rm id} = 0$,
i.e., the ideal computation succeeds with probability one, then
it is possible to prove an average-case analogue. The result is

$$\bar{p}_e \le C_{\rm pro}(\mathcal{E}, \mathcal{F})^2 = 1 - F_{\rm pro}(\mathcal{E}, \mathcal{F}). \tag{B13}$$

The proof uses very similar techniques to those used to
establish Eqs. (21) and (20), and is therefore omitted.

3. Sampling computation in the worst case 

The quantum operation $\mathcal{E}$ is an imperfect attempt to
reproduce the statistics of the ideal operation $\mathcal{F}$ which
acts on an input $|x\rangle$. Measured in the computational
basis, $\mathcal{F}$ gives rise to a distribution $\{p_x(y)\} = p_x$, whereas
$\mathcal{E}$ gives a distribution $\{q_x(y)\} = q_x$.

The inequalities Eqs. (23) and (24) that we want to
prove may be stated as follows:

\begin{align}
\max_x D(q_x, p_x) &\le D_{\rm stab}(\mathcal{E}, \mathcal{F}), \tag{B14} \\
\min_x F(q_x, p_x) &\ge F_{\rm stab}(\mathcal{E}, \mathcal{F}). \tag{B15}
\end{align}

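To prove the first inequality, Eq. (B14), let $\mathcal{M}$ again be
a quantum operation representing measurement in the
computational basis. Note that for all $x$

\begin{align}
D(q_x, p_x) &= D\big((\mathcal{M} \circ \mathcal{E})(|x\rangle\langle x|), (\mathcal{M} \circ \mathcal{F})(|x\rangle\langle x|)\big) \tag{B16} \\
&\le D\big(\mathcal{E}(|x\rangle\langle x|), \mathcal{F}(|x\rangle\langle x|)\big) \tag{B17} \\
&\le D_{\rm stab}(\mathcal{E}, \mathcal{F}), \tag{B18}
\end{align}

where we used simple algebra in the first line, contractivity
in the second line, and the definition of $D_{\rm stab}$ in
the third line. An analogous argument can be used to
establish the second inequality, Eq. (B15).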
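
The contractivity step from Eq. (B16) to Eq. (B17) can be illustrated as follows. This is a minimal sketch, assuming NumPy, in which random density matrices stand in for the outputs $\mathcal{E}(|x\rangle\langle x|)$ and $\mathcal{F}(|x\rangle\langle x|)$; the helper names and the sampling are assumptions made for the example. It compares the classical distance between the measured distributions $q_x$ and $p_x$ with the trace distance between the unmeasured states.

```python
import numpy as np

def trace_distance(rho, sigma):
    """Quantum trace distance D(rho, sigma) = (1/2) tr|rho - sigma|."""
    return 0.5 * float(np.sum(np.abs(np.linalg.eigvalsh(rho - sigma))))

def classical_distance(p, q):
    """Classical trace (total variation) distance between two distributions."""
    return 0.5 * float(np.sum(np.abs(p - q)))

def random_density_matrix(d, rng):
    """Random density matrix (illustrative sampling, not a specific measure)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(2)
largest_excess = -np.inf
for _ in range(200):
    out_real = random_density_matrix(4, rng)    # stands in for E(|x><x|)
    out_ideal = random_density_matrix(4, rng)   # stands in for F(|x><x|)
    q_x = np.diag(out_real).real                # statistics after measuring M
    p_x = np.diag(out_ideal).real
    excess = classical_distance(q_x, p_x) - trace_distance(out_real, out_ideal)
    largest_excess = max(largest_excess, excess)
# Contractivity predicts the excess is never positive, up to rounding error.
print(largest_excess)
```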